SQL Server 2019 is SQL for big data

Blog | by Mary Branscombe | 7 November 2018

The big news with the SQL Server 2017 release was support for running on Linux and in containers, graph queries, and running machine learning in R and Python where your data lives. A year later, the CTP 2.0 preview of SQL Server 2019 announced at the Ignite conference also goes beyond the familiar relational database, with a new architecture that combines the SQL Server database engine, Apache Spark and Hadoop Distributed File System (HDFS) support as a hybrid platform for big data, so you can connect to relational, NoSQL and big data sources and work with them all in a unified way.

This gives SQL Server a distributed architecture, where you can pick and mix the elements that best suit your data needs. The Spark engine is now part of SQL Server, so you can combine SQL compute nodes with either SQL or HDFS storage nodes depending on whether you need relational tables or a data lake, using Spark for data science, advanced analytics and machine learning tasks, and have SQL Server and Spark running in the same Kubernetes container deployment.

With SQL Server 2017, Docker container support was often seen mainly as a way to speed up initial deployment for development and test; the Kubernetes support in SQL Server 2019 is much broader, covering the features needed for production deployments.

These new SQL Server Big Data Clusters create an elastic scale-out data virtualisation platform where you can deploy both SQL and Spark Linux containers on Kubernetes, including deploying SQL Server Availability Groups in Kubernetes. To do that, you first deploy an operator role into Kubernetes that orchestrates the deployment of pods, connects to them and then orchestrates the full deployment of the availability group onto that pod deployment – which allows for rolling upgrades to apply updates with less downtime. If you need to provide a quorum within the availability group, Microsoft is working on an open source Paxos implementation (which will be available on GitHub) that will provide a similar architecture to failover cluster instances. The big data clusters are in a limited public preview; you have to register and request access.

Whether you’re moving to the new big data clusters or sticking to a conventional SQL Server architecture, PolyBase gives you more connectivity in the 2019 release. PolyBase still supports Spark, Hortonworks and Cloudera Hadoop, but there are new connectors to query Oracle, Teradata and MongoDB (including Cosmos DB), generic ODBC data sources (like DB2, SAP HANA and even Excel) and other SQL Server databases directly from SQL Server, without needing to move or replicate the data. That makes it much faster to generate reports that need information from external tables.
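As a rough sketch of what querying an external source involves, the Python helper below only builds the two T-SQL statements that PolyBase typically needs: an external data source and an external table that points at it. Every name here (OracleSales, OracleCredential, oracle-host, XE.SALES.CUSTOMERS, the column list) is a hypothetical placeholder, and the script assumes a database scoped credential already exists; check the statement options against the documentation for your build before running anything.

```python
def bracket(name: str) -> str:
    """Quote a T-SQL identifier with brackets, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def literal(value: str) -> str:
    """Quote a T-SQL string literal, doubling any single quote."""
    return "'" + value.replace("'", "''") + "'"


def external_data_source(name: str, location: str, credential: str) -> str:
    """Build a CREATE EXTERNAL DATA SOURCE statement for a PolyBase connector."""
    return (
        f"CREATE EXTERNAL DATA SOURCE {bracket(name)}\n"
        f"WITH (LOCATION = {literal(location)}, CREDENTIAL = {bracket(credential)});"
    )


def external_table(schema, table, columns, remote_object, data_source) -> str:
    """Build a CREATE EXTERNAL TABLE statement; columns is a list of (name, type)."""
    column_lines = ",\n".join(
        f"    {bracket(col)} {sql_type}" for col, sql_type in columns
    )
    return (
        f"CREATE EXTERNAL TABLE {bracket(schema)}.{bracket(table)} (\n"
        f"{column_lines}\n"
        f")\n"
        f"WITH (LOCATION = {literal(remote_object)}, "
        f"DATA_SOURCE = {bracket(data_source)});"
    )


if __name__ == "__main__":
    # Hypothetical Oracle source; the credential is assumed to exist already.
    print(external_data_source("OracleSales", "oracle://oracle-host:1521",
                               "OracleCredential"))
    print(external_table("dbo", "OracleCustomers",
                         [("CustomerId", "INT"), ("Name", "NVARCHAR(100)")],
                         "XE.SALES.CUSTOMERS", "OracleSales"))
```

Once the external table exists, it can be joined to local tables in an ordinary SELECT, which is what removes the need to copy the data first.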

The integrated security tier in SQL Server covers the Spark and HDFS integration, protecting data at rest and in motion with the Always Encrypted option (which now allows more complex operations on encrypted data if your servers have secure enclaves), plus built-in data discovery, classification and auditing (which can now log the sensitivity classification of data returned in a query) across all data stores. SQL Server Configuration Manager now includes certificate management for deploying certificates to failover clusters and Always On Availability Groups, and for viewing installed certificates (including a handy view showing certificates that will expire soon).

SQL Server 2019 at a glance. Source: Microsoft

Intelligent Query Processing takes the automated performance tuning of Adaptive Query Processing in SQL Server 2017 further, building on the performance tuning that’s done in SQL Azure. Choose the new database compatibility level 150 to have your query performance tuned automatically, either at runtime or based on analysing past performance. Adjusting the memory used for a query based on past performance already worked for batch execution with columnstore indexes; now it’s available for all queries, and batch execution works for row stores. Lightweight query profiling is now turned on by default, so you can look back to understand query performance, or look at live queries for troubleshooting, without needing to turn on extra data collection for diagnostics.
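Opting in to compatibility level 150 is a single statement per database. The sketch below just generates that statement and a query to check the current level; the database name SalesDb is a hypothetical placeholder, and you would run the output with whatever client you already use.

```python
def bracket(name: str) -> str:
    """Quote a T-SQL identifier with brackets, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def set_compatibility_level(database: str, level: int = 150) -> str:
    """Build the ALTER DATABASE statement that opts a database in to a level."""
    if not isinstance(level, int):
        raise TypeError("compatibility level must be an integer")
    return f"ALTER DATABASE {bracket(database)} SET COMPATIBILITY_LEVEL = {level};"


def check_compatibility_level(database: str) -> str:
    """Build a query that reports the current level from sys.databases."""
    name_literal = database.replace("'", "''")
    return (
        "SELECT name, compatibility_level FROM sys.databases "
        f"WHERE name = N'{name_literal}';"
    )


if __name__ == "__main__":
    # SalesDb is a made-up database name used only for illustration.
    print(check_compatibility_level("SalesDb"))
    print(set_compatibility_level("SalesDb"))
```

Because the setting is per database, you can move one database to 150 and leave others on an older level while you compare query behaviour.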

The core SQL Server engine gets some updates too, including UTF-8 encoding support (which could save a significant amount of storage) and online index build and rebuild when you convert row-store tables into the clustered columnstores useful for analytics (previously, the database had to be paused while the clustered columnstore index was created but now you can carry on working against the database while the conversion happens). Developers will be able to pause and resume the creation of an online index rather than starting from the beginning if it’s interrupted (or the database runs out of space). That can be set as the default for a database, if necessary.
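How much storage UTF-8 saves depends on the characters you store. The sketch below compares the encoded size of a few sample strings in UTF-16 (the encoding used for nchar and nvarchar columns) and UTF-8; the sample strings are made up, and it measures raw encoded bytes only, not row or page overhead.

```python
def storage_bytes(text: str) -> dict:
    """Return the encoded size of text in UTF-16 and UTF-8, in bytes."""
    return {
        "utf16": len(text.encode("utf-16-le")),  # no byte-order mark
        "utf8": len(text.encode("utf-8")),
    }


if __name__ == "__main__":
    # Illustrative samples: plain ASCII, Latin with an accent, Japanese.
    for sample in ["SQL Server 2019", "Zürich", "データ"]:
        sizes = storage_bytes(sample)
        print(f"{sample!r}: UTF-16 {sizes['utf16']} bytes, "
              f"UTF-8 {sizes['utf8']} bytes")
```

ASCII-heavy text halves in size (30 bytes down to 15 for the first sample), while the Japanese sample grows from 6 bytes to 9, so the saving is largest for data that is mostly Latin characters.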

There are enhancements to the graph queries introduced in SQL Server 2017. Using the new MATCH support in a MERGE statement, you can insert a new edge or update an existing edge between two nodes in a single statement, rather than needing separate insert, update and delete statements. By default, edge tables can connect any two nodes in the database; edge constraints allow you to limit the type of nodes an edge table can connect.
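To make the edge constraint idea concrete, here is a small sketch that generates the DDL for an edge table restricted to particular node tables. The table, column and constraint names (bought, Customer, Product, EC_BOUGHT, PurchaseCount) are hypothetical, and the helper only accepts simple identifiers so that it never has to quote anything.

```python
def ident(name: str) -> str:
    """Accept only simple identifiers so no quoting is needed in the DDL."""
    if not name.isidentifier():
        raise ValueError(f"not a simple identifier: {name!r}")
    return name


def edge_table_ddl(edge, columns, constraint, connections) -> str:
    """Build CREATE TABLE ... AS EDGE with a CONNECTION edge constraint.

    columns is a list of (name, type); connections is a list of
    (from_node_table, to_node_table) pairs the edge may link.
    """
    column_lines = [f"    {ident(col)} {sql_type}," for col, sql_type in columns]
    pairs = ", ".join(f"{ident(src)} TO {ident(dst)}" for src, dst in connections)
    lines = [f"CREATE TABLE {ident(edge)} ("]
    lines += column_lines
    lines.append(f"    CONSTRAINT {ident(constraint)} CONNECTION ({pairs})")
    lines.append(") AS EDGE;")
    return "\n".join(lines)


if __name__ == "__main__":
    # Only Customer -> Product edges may be inserted into this edge table.
    print(edge_table_ddl("bought", [("PurchaseCount", "INT")],
                         "EC_BOUGHT", [("Customer", "Product")]))
```

With a constraint like this in place, an attempt to insert an edge between two other node types is rejected by the engine rather than left for application code to police.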

SQL Server for Linux 2019

Now that HDFS is supported natively in SQL Server, developers can bring data from multiple sources for machine learning data model training and operationalise that model in a single system. If T-SQL doesn’t have all the features needed, Java joins Python and R as languages that you can execute in-place inside SQL Server.

The Machine Learning Services component that supports Java, Python and R now runs on SQL Server on Linux, not just Windows Server, giving developers a much wider choice of languages and environments. On Windows Server, meanwhile, Machine Learning Services can now be deployed into failover clusters for availability.

SQL Server for Linux catches up on a few other missing features in SQL Server 2019: notably replication (transactional, snapshot or merge) and distributed transactions (with support for the Microsoft Distributed Transaction Coordinator (MSDTC)). New OpenLDAP support allows third-party AD providers to join your domain, simplifying access management.

As persistent (or storage class) memory like Intel Optane starts to arrive in production systems, it’s ideal for in-memory databases. SQL Server on Windows has supported persistent memory since SQL Server 2016 and in SQL Server 2019 on Windows Server 2019 database objects can be stored on persistent memory using standard block-based storage. But in SQL Server for Linux you can now bypass the Linux storage stack and access persistent memory devices directly to get lower-latency IO with SQL database files, transaction logs and in-memory OLTP checkpoint files.

If you’re looking for the latest version of SQL Operations Studio to work with SQL Server across multiple platforms and manage all these options, the name is changing to Azure Data Studio, because it’s becoming more modular and supports multiple data sources, including SQL Server 2019 as well as SQL Azure. It also has a notebook experience for running Query books. There’s also a preview of SQL Server Management Studio 18.0 for configuring and administering SQL Server components, so the tools are there to help you try out the previews of SQL Server 2019.

Grey Matter has a team of SQL licensing specialists who can help you with your database questions. They’re available to call: +44 (0)1364 654100 or email: licensing@greymatter.com

Mary Branscombe

Mary Branscombe is a freelance tech journalist. Mary has been a technology writer for nearly two decades, covering everything from early versions of Windows and Office to the first smartphones, the arrival of the web and most things in between.