A question that I have been hearing recently from customers using Azure Synapse Analytics (the public preview version) is what is the difference between using an external table versus T-SQL view on a file in a data lake?
Note that a T-SQL view and an external table pointing to a file in a data lake can be created in both a SQL Provisioned pool as well as a SQL On-demand pool.
Here are the differences that I have found:
I often get asked what is the difference in performance when it comes to querying using an external table or view against a file in ADLS Gen2 vs. querying against a highly compressed table in a SQL Provisioned pool (i.e. managed table). It’s hard to quantify without understanding more about each customers scenario, but you will roughly see a 5X performance difference between queries over external tables and views vs. managed tables (obviously, depending on the query, that will vary but that’s a rough number – could be more than 5X in some scenarios). A few things that contribute to that: in-memory caching, SSD based caches, result-set caching, and the ability to design and align data and tables when they are stored as managed tables. You can also create materialized views for managed tables which typically bring lots of performance improvements as well. If you are querying Parquet data, that is in columnstore file format with compression so that would give you similar data/column elimination as what managed SQL clustered columnstore index (CCI) would give, but if you are querying non-Parquet files you do not get this functionality. Note that for managed tables, on top of performance, you also get a granular security model, workload management capabilities, and so on (see Data Lakehouse & Synapse).
The post External tables vs T-SQL views on files in data lake first appeared on James Serra’s Blog.
The post External tables vs T-SQL views on files in data lake appeared first on SQLServerCentral.
I remember deciding to pursue my first IT certification, the CompTIA A+. I had signed…
Key takeaways The transformer architecture has proved to be revolutionary in outperforming the classical RNN…
Once we learn how to deploy an Ubuntu server, how to manage users, and how…
Key-takeaways: Clean code isn’t just a nice thing to have or a luxury in software projects; it's a necessity. If we…
While developing a web application, or setting dynamic pages and meta tags we need to deal with…
Software architecture is one of the most discussed topics in the software industry today, and…