{"id":265,"date":"2022-05-01T20:00:02","date_gmt":"2022-05-01T23:00:02","guid":{"rendered":"https:\/\/gestiaconsultores.com.ar\/en\/?p=265"},"modified":"2022-12-01T17:11:37","modified_gmt":"2022-12-01T20:11:37","slug":"connection-types-in-power-bi","status":"publish","type":"post","link":"https:\/\/gestiaconsultores.com.ar\/en\/news\/connection-types-in-power-bi\/","title":{"rendered":"Connection types in Power BI: Live Connection, DirectQuery, or Import Data."},"content":{"rendered":"<p><img decoding=\"async\" src=\"https:\/\/i0.wp.com\/radacad.com\/wp-content\/uploads\/2017\/09\/2017-09-13_12h13_49.png?resize=640%2C401\" alt=\"2017-09-13_12h13_49\" \/><\/p>\n<p>Power BI supports three methods for connecting to data sources, called Live Connection, DirectQuery, and Import Data (also called Scheduled Refresh). In the following points we will analyze what each of these types means, their characteristics, pros, cons, and which one to choose in each situation.<\/p>\n<p>Determining which type of connection to choose between these (or Composite mode that integrates DirectQuery + Import Data) is a major decision in defining the architecture of a Power BI solution, since it is expensive to change the connection type once when one of the ways is defined.<\/p>\n<h2 id=\"h-what-is-import-data-or-scheduled-refresh\">Import Data<\/h2>\n<p>This method is called &#8220;Import Data&#8221; because the data is loaded into Power BI. Some call it \u201cScheduled Refresh\u201d since you must schedule the upload and refresh of data in the Power BI service. Loading the data into Power BI means consuming memory and disk space, either on the development machine&#8217;s memory and disk space (when using Power BI Desktop) or in the Power BI service (when the report is published to the cloud). The greater the number of tables and data, the greater the use of resources.<\/p>\n<p>There is no simple, straight forward calculation to know how much your source data is going to occupy in Power BI memory (i.e., it is not rows by columns by data type), as Power BI uses the <strong>xVelocity<\/strong> compression engine, which compresses the data and stores them in a compressed format. A 1 GB Excel file might be as small as 10 MB in Power BI, depending on the number of unique values in the column, data types, and other variables. In short, the size of the data in Power BI will typically be much smaller than the size of the data source.<\/p>\n<h3 id=\"h-important-pros-and-cons-of-this-method\">Import data important pros and cons<\/h3>\n<h4 id=\"h-power-bi-full-functional\">PRO \u2013 Power BI 100% functional<\/h4>\n<p>With this method you can use all Power BI features and components, both Power Query to combine data from multiple sources and DAX to write advanced time intelligence expressions or visualizations. There are no limitations on Power BI functionality with this method.<\/p>\n<h4 id=\"h-power-bi-full-functional\">CON \u2013 Size limitation<\/h4>\n<p>Size and capacity limits apply with this method. On the one hand, each model or file in Power BI cannot occupy more than 1 GB (with Premium capacity or Premium licenses this limit is 100 GB), always considering the compressed model. Then, you have a size of up to 10 GB in the account, while Power BI Premium has a limit of 100 TB. If using Power BI Report Server (the on-premises version of the Power BI service), the size limit is 2 GB per file.<\/p>\n<h4>PRO \u2013 Fastest method<\/h4>\n<p>This connection method is the fastest viable option. Report queries will be evaluated against the uploaded data and used from server memory, with no latency issues.<\/p>\n<h2 id=\"h-what-is-directquery\">DirectQuery<\/h2>\n<p>DirectQuery is a direct connection to the data source. The data is NOT stored in Power BI, so Power BI is primarily a visualization layer and queries the data from the data source whenever it requires it. Power BI only stores the metadata of the tables (table names, column names, relationships), but not the data. The Power BI file size will be much smaller and will most likely never reach the size limit because there is no data stored in the model.<\/p>\n<p>DirectQuery is only feasible for certain data sources. Here are a few (full list in <a title=\"Power BI data sources\" href=\"https:\/\/docs.microsoft.com\/en-us\/power-bi\/connect-data\/power-bi-data-sources\" target=\"_blank\" rel=\"nofollow noopener\">Power BI data sources)<\/a>:<\/p>\n<ul>\n<li>Amazon Redshift<\/li>\n<li>Azure HDInsight Spark<\/li>\n<li>Azure SQL Database<\/li>\n<li>Azure SQL Datawarehouse<\/li>\n<li>IBM Netezza<\/li>\n<li>Impala<\/li>\n<li>Oracle Database<\/li>\n<li>SAP Business Warehouse<\/li>\n<li>SAP HANA<\/li>\n<li>Snowflake<\/li>\n<li>Spark<\/li>\n<li>SQL Server Database<\/li>\n<li>Teradata Database<\/li>\n<\/ul>\n<h3 id=\"h-important-pros-and-cons-of-this-method-1\">Important pros and cons of DirectQuery<\/h3>\n<h4 id=\"h-scaleability-the-main-advantage\">PRO \u2013 Scalability<\/h4>\n<p>This method has no size limitation, because the data is not stored in the Power BI file. Data sources with petabytes of data can be used in SQL Server, Oracle, or any other supported data source and connected from Power BI.<\/p>\n<h4 id=\"h-limited-functionality-few-power-query-operations-mainly-visualization\">CON \u2013 Limited functionality<\/h4>\n<p>This method does not make the full functionality of Power BI available. As seen in the image below, there will only be two tabs in Power BI Desktop: Report (for visualization design) and Model (for changing relationships if necessary).<\/p>\n<figure class=\"wp-block-image\"><a href=\"https:\/\/translate.google.com\/website?sl=en&amp;tl=es&amp;hl=en&amp;client=webapp&amp;u=https:\/\/i1.wp.com\/radacad.com\/wp-content\/uploads\/2017\/09\/2017-09-13_10h22_06.png\"><img decoding=\"async\" loading=\"lazy\" class=\"wp-image-6266\" src=\"https:\/\/i1.wp.com\/radacad.com\/wp-content\/uploads\/2017\/09\/2017-09-13_10h22_06.png?resize=447%2C490\" sizes=\"(max-width: 447px) 100vw, 447px\" srcset=\"https:\/\/translate.google.com\/website?sl=en&amp;tl=es&amp;hl=en&amp;client=webapp&amp;u=https:\/\/i1.wp.com\/radacad.com\/wp-content\/uploads\/2017\/09\/2017-09-13_10h22_06.png?w%3D447%26ssl%3D1 447w,https:\/\/translate.google.com\/website?sl=en&amp;tl=es&amp;hl=en&amp;client=webapp&amp;u=https:\/\/i1.wp.com\/radacad.com\/wp-content\/uploads\/2017\/09\/2017-09-13_10h22_06.png?resize%3D274%252C300%26ssl%3D1 274w\" alt=\"2017-09-13_10h22_06\" width=\"447\" height=\"490\" data-recalc-dims=\"1\" \/><\/a><\/figure>\n<p>Power Query is limited to folding operations (more information in <a title=\"Power Query query folding\" href=\"https:\/\/docs.microsoft.com\/en-us\/power-query\/power-query-folding\" target=\"_blank\" rel=\"nofollow noopener\">Power Query query folding<\/a>), but DAX can be used in the display layer.<\/p>\n<h4 id=\"h-slow-connection\">CON \u2013 Slow connection<\/h4>\n<p>A big disadvantage of this method is that the connection is slower than other types of connection. This is because in a report, made up of multiple visuals, each visual sends a query to the data sources, and then the data comes back. It is particularly important to evaluate the latencies between the servers and analyze the performance of the data sources, making performance adjustments on them.<\/p>\n<h2 id=\"h-what-is-live-connection\">Live connection<\/h2>\n<p>Live Connection is like DirectQuery in the way it works with the data source since the data is not stored in Power BI and the data sources are queried each time. Live Connection is only compatible with these data sources:<\/p>\n<ul>\n<li>Azure Analysis Services<\/li>\n<li>SQL Server Analysis Services (SSAS) Tabular<\/li>\n<li>SQL Server Analysis Services (SSAS) Multi-Dimensional<\/li>\n<li>Power BI Dataset in the Service<\/li>\n<\/ul>\n<p>Because these data sources are modeling engines themselves, Power BI just connects to them and gets all the model metadata (table names, column names, relationships). With this method, Power BI is used for visualization only, and all modeling requirements must be performed at the data source.<\/p>\n<h3 id=\"h-important-pros-and-cons-of-this-method-2\">Important Pros and Cons of Live Connection<\/h3>\n<h4 id=\"h-slow-connection\">PRO \u2013 Large model<\/h4>\n<p>The great benefit of this model is that you can have a large data model and you can also take advantage of the SSAS (SQL Server Analysis Services) modeling layer, if you use the Tabular engine, you will have DAX, and if you use the OLAP Multidimensional model, you will have MDX. With any of these two languages, all calculation and modeling needs can be covered. This method has better modeling capabilities than DirectQuery.<\/p>\n<h4 id=\"h-no-power-query-just-visualization\">CON \u2013 Visualization only<\/h4>\n<p>The big disadvantage of this method is that you do not have the Power Query transformation. As seen in the image, in Power BI Desktop there is only the Report tab (for visualization design).<\/p>\n<figure class=\"wp-block-image\"><a href=\"https:\/\/translate.google.com\/website?sl=en&amp;tl=es&amp;hl=en&amp;client=webapp&amp;u=https:\/\/i1.wp.com\/radacad.com\/wp-content\/uploads\/2017\/09\/2017-09-13_10h39_56.png\"><img decoding=\"async\" loading=\"lazy\" class=\"wp-image-6269\" src=\"https:\/\/i1.wp.com\/radacad.com\/wp-content\/uploads\/2017\/09\/2017-09-13_10h39_56.png?resize=640%2C330\" sizes=\"(max-width: 640px) 100vw, 640px\" srcset=\"https:\/\/translate.google.com\/website?sl=en&amp;tl=es&amp;hl=en&amp;client=webapp&amp;u=https:\/\/i1.wp.com\/radacad.com\/wp-content\/uploads\/2017\/09\/2017-09-13_10h39_56.png?w%3D788%26ssl%3D1 788w,https:\/\/translate.google.com\/website?sl=en&amp;tl=es&amp;hl=en&amp;client=webapp&amp;u=https:\/\/i1.wp.com\/radacad.com\/wp-content\/uploads\/2017\/09\/2017-09-13_10h39_56.png?resize%3D300%252C155%26ssl%3D1 300w\" alt=\"2017-09-13_10h39_56\" width=\"640\" height=\"330\" data-recalc-dims=\"1\" \/><\/a><\/figure>\n<h4 id=\"h-report-level-measures\">PRO \u2013 Measurements at report level<\/h4>\n<p>As a complement to the model made in the data source, with this type of connection DAX measurements can be made at the report level. However, it is not the most appropriate, since the measures will only be visible in the report where they are developed and will not remain in the model for reuse.<\/p>\n<figure class=\"wp-block-image\"><a href=\"https:\/\/translate.google.com\/website?sl=en&amp;tl=es&amp;hl=en&amp;client=webapp&amp;u=https:\/\/i1.wp.com\/radacad.com\/wp-content\/uploads\/2017\/09\/2017-09-13_10h44_36.png\"><img decoding=\"async\" loading=\"lazy\" class=\"wp-image-6270\" src=\"https:\/\/i1.wp.com\/radacad.com\/wp-content\/uploads\/2017\/09\/2017-09-13_10h44_36.png?resize=640%2C344\" sizes=\"(max-width: 640px) 100vw, 640px\" srcset=\"https:\/\/translate.google.com\/website?sl=en&amp;tl=es&amp;hl=en&amp;client=webapp&amp;u=https:\/\/i1.wp.com\/radacad.com\/wp-content\/uploads\/2017\/09\/2017-09-13_10h44_36.png?w%3D754%26ssl%3D1 754w,https:\/\/translate.google.com\/website?sl=en&amp;tl=es&amp;hl=en&amp;client=webapp&amp;u=https:\/\/i1.wp.com\/radacad.com\/wp-content\/uploads\/2017\/09\/2017-09-13_10h44_36.png?resize%3D300%252C161%26ssl%3D1 300w\" alt=\"2017-09-13_10h44_36\" width=\"640\" height=\"344\" data-recalc-dims=\"1\" \/><\/a><\/figure>\n<h2>Composite model<\/h2>\n<div class=\"wp-block-group\">\n<div class=\"wp-block-group__inner-container\">\n<p>Power BI allows you to generate composite models that consist of the combination of DirectQuery sources with Import Data data sources, delivering the performance and flexibility of Import Data, along with the scalability and large data size of DirectQuery.<\/p>\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/i1.wp.com\/radacad.com\/wp-content\/uploads\/2018\/07\/2018-07-23_16h47_42.png?resize=640%2C471\" alt=\"\" \/><\/figure>\n<p>Composite models not only support DirectQuery for data sources such as SQL Server and other DirectQuery sources but are also supported for Power BI datasets (dataflows or datasets), allowing you to create a chained dataset from the Power BI main dataset.<\/p>\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/i0.wp.com\/radacad.com\/wp-content\/uploads\/2020\/12\/2020-12-17_09h34_24.png?resize=640%2C369&amp;ssl=1\" alt=\"\" \/><figcaption><\/figcaption><\/figure>\n<\/div>\n<\/div>\n<p>&nbsp;<\/p>\n<h2 id=\"h-what-are-architecture-scenarios-to-use-for-each-method\">What are the architecture scenarios to use for each method?<\/h2>\n<h3 id=\"h-import-data-for-agility-and-performance\">Import data for agility and performance<\/h3>\n<p>Using Import Data you will have a fully functional Power BI with exceptional performance, so if your data set is not huge then you can easily use this method and produce reports in an extremely fast development period.<\/p>\n<h3 id=\"h-live-connection-for-enterprise-solution\">Live connection for business solutions<\/h3>\n<p>Many companies already have pre-built models in tabular or multidimensional SSAS. These models can be easily used in Power BI Live Connection. Even if an enterprise has not yet started the AAS (Azure Analysis Services) or SSAS solution, and is dealing with a large data set, this option is better than Direct Query, since SSAS has MDX or DAX analytical expression languages for cope with multiple calculation and modeling challenges, plus higher performance. Here is a sample architecture that can be used with this method.<\/p>\n<p class=\"wp-block-image\"><a href=\"https:\/\/translate.google.com\/website?sl=en&amp;tl=es&amp;hl=en&amp;client=webapp&amp;u=https:\/\/i0.wp.com\/radacad.com\/wp-content\/uploads\/2017\/09\/2017-09-13_11h36_40.png\"><img decoding=\"async\" loading=\"lazy\" class=\"wp-image-6271\" src=\"https:\/\/i0.wp.com\/radacad.com\/wp-content\/uploads\/2017\/09\/2017-09-13_11h36_40.png?resize=640%2C315\" sizes=\"(max-width: 640px) 100vw, 640px\" srcset=\"https:\/\/translate.google.com\/website?sl=en&amp;tl=es&amp;hl=en&amp;client=webapp&amp;u=https:\/\/i0.wp.com\/radacad.com\/wp-content\/uploads\/2017\/09\/2017-09-13_11h36_40.png?w%3D1419%26ssl%3D1 1419w,https:\/\/translate.google.com\/website?sl=en&amp;tl=es&amp;hl=en&amp;client=webapp&amp;u=https:\/\/i0.wp.com\/radacad.com\/wp-content\/uploads\/2017\/09\/2017-09-13_11h36_40.png?resize%3D300%252C148%26ssl%3D1 300w,https:\/\/translate.google.com\/website?sl=en&amp;tl=es&amp;hl=en&amp;client=webapp&amp;u=https:\/\/i0.wp.com\/radacad.com\/wp-content\/uploads\/2017\/09\/2017-09-13_11h36_40.png?resize%3D1024%252C504%26ssl%3D1 1024w,https:\/\/translate.google.com\/website?sl=en&amp;tl=es&amp;hl=en&amp;client=webapp&amp;u=https:\/\/i0.wp.com\/radacad.com\/wp-content\/uploads\/2017\/09\/2017-09-13_11h36_40.png?w%3D1280%26ssl%3D1 1280w\" alt=\"2017-09-13_11h36_40\" width=\"640\" height=\"315\" data-recalc-dims=\"1\" \/><\/a><\/p>\n<p>Additionally, in a business environment there are important benefits to using Live Connection. If you are working on an analysis project with multiple report authors and team members, the dataset can be shared with a live connection, and the report authors can consume the dataset simultaneously.<\/p>\n<p class=\"wp-block-image\"><img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/i0.wp.com\/radacad.com\/wp-content\/uploads\/2017\/04\/2017-04-10_15h04_21.png?w=640\" alt=\"2017-04-10_15h04_21\" width=\"640\" height=\"525\" \/><\/p>\n<h3>DirectQuery for non-Microsoft sources<\/h3>\n<p>The DirectQuery connection is not widely used in Microsoft&#8217;s solution architecture. The main reason is that if you have a Microsoft-based solution architecture, you will probably use SSAS as mentioned above to take advantage of their analytics engine. DirectQuery mode is mainly used in non-Microsoft architecture, when the data model is managed in Oracle system, MySQL, SAP HANA, or something like that.<\/p>\n<h2 id=\"h-summary\">Summary<\/h2>\n<p>In summary, the three types of connection have been described: Live Connection, Import Data and DirectQuery; along with their differences, pros and cons, and the scenarios in which each should be used. This is an introduction, and more information can be found about it.<\/p>\n<p>&nbsp;<\/p>\n<p><strong>Sources:<\/strong><\/p>\n<ul>\n<li><a href=\"https:\/\/radacad.com\/directquery-live-connection-or-import-data-tough-decision\" target=\"_blank\" rel=\"noopener sponsored\">Power BI Connection Types: DirectQuery, Live, or Import? Tough Decision!<\/a><\/li>\n<li><a href=\"https:\/\/docs.microsoft.com\/en-us\/power-bi\/connect-data\/power-bi-data-sources\" target=\"_blank\" rel=\"noopener\">Power BI data sources<\/a><\/li>\n<li><a href=\"https:\/\/radacad.com\/not-folding-the-black-hole-of-power-query-performance\" target=\"_blank\" rel=\"noopener\">Not Folding; the Black Hole of Power Query Performance<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Power BI supports three methods for connecting to data sources, called Live Connection, DirectQuery, and Import Data (also called Scheduled Refresh). In the following points we will analyze what each of these types means, their characteristics, pros, cons, and which one to choose in each situation. Determining which type of connection to choose between these (or Composite mode that integrates DirectQuery + Import Data) is a major decision in defining the architecture of a Power BI solution, since it is expensive to change the connection type once when one of the ways is defined. Import Data This method is called &#8220;Import Data&#8221; because the data is loaded into Power BI. Some call it \u201cScheduled Refresh\u201d since you must schedule the upload and refresh of data in the Power BI service. Loading the data into Power BI means consuming memory and disk space, either on the development machine&#8217;s memory and disk space (when using Power BI Desktop) or in the Power BI service (when the report is published to the cloud). The greater the number of tables and data, the greater the use of resources. There is no simple, straight forward calculation to know how much your source data is going [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_joinchat":[]},"categories":[31,32,27,1],"tags":[49,46,47,45,44,48],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/gestiaconsultores.com.ar\/en\/wp-json\/wp\/v2\/posts\/265"}],"collection":[{"href":"https:\/\/gestiaconsultores.com.ar\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/gestiaconsultores.com.ar\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/gestiaconsultores.com.ar\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/gestiaconsultores.com.ar\/en\/wp-json\/wp\/v2\/comments?post=265"}],"version-history":[{"count":8,"href":"https:\/\/gestiaconsultores.com.ar\/en\/wp-json\/wp\/v2\/posts\/265\/revisions"}],"predecessor-version":[{"id":838,"href":"https:\/\/gestiaconsultores.com.ar\/en\/wp-json\/wp\/v2\/posts\/265\/revisions\/838"}],"wp:attachment":[{"href":"https:\/\/gestiaconsultores.com.ar\/en\/wp-json\/wp\/v2\/media?parent=265"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/gestiaconsultores.com.ar\/en\/wp-json\/wp\/v2\/categories?post=265"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/gestiaconsultores.com.ar\/en\/wp-json\/wp\/v2\/tags?post=265"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}