Gaining performance insights using the Intel Advisor Python API

Blog|by James Roberts|3 September 2018

According to a recent article in The Economist, Python is fast becoming the world’s most popular coding language. And the State of the Developer Ecosystem in 2018 report from JetBrains ranked Python as the most popular language that developers have started to learn/continued to learn in the last 12 months.

Python is a general-purpose language that is relatively easy to use, with a huge user base and a large amount of documentation. Its straightforward syntax and use of indentation make it easy to learn, read and share. It is the language of choice for fast prototyping, and it supports asynchronous programming and frameworks.

With thanks to Intel’s magazine “The Parallel Universe”, we can take a close look at how the Intel® Advisor Python API in Intel® Parallel Studio XE provides a way to generate program statistics and reports to help you optimise the performance of your system.

Getting good data to make code tuning decisions

Good design decisions are based on good data:

What loops should be threaded and vectorised first?
Is the performance gain worth the effort?
Will the threading performance scale with higher core counts?
Does this loop have a dependency that prevents vectorisation?
What are the trip counts and memory access patterns?
Have you vectorised efficiently with the latest Intel® Advanced Vector Extensions 512 (Intel® AVX-512) instructions? Or are you using older SIMD instructions?

Intel Advisor is a dynamic analysis tool that’s part of Intel Parallel Studio XE, Intel’s comprehensive tool suite for building and modernising code. Intel Advisor answers these questions, and many more. You can collect insightful program metrics on the vectorisation and memory profile of your application. And, besides providing tailored reports using the GUI and command line, Intel Advisor now gives you the added flexibility to mine a collected database and create powerful new reports using Python.

When you run Intel Advisor, it stores all the data it collects in a proprietary database that you can now access using a Python API. This provides a flexible way to generate customised reports on program metrics. This article will describe how to use this new functionality.

Getting started

To get started, you need to set up the Intel Advisor environment. (For this article, all the scripts were run on Linux, but the Intel Advisor Python API also supports Windows.)

Source: Intel

Next, to set up the Intel Advisor data, you need to run some collections. Some of the program metrics require additional analysis such as tripcounts, memory access patterns, and dependencies.

Source: Intel

To run a map or dependencies collection, you need to specify the loops that you want to analyse. You can find this information using the Intel Advisor GUI or by generating a command-line report.

Source: Intel
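The collections described above can also be driven from a script. The sketch below only builds the command lines as Python lists and prints them; it does not run anything. The advixe-cl options shown (--collect, --project-dir, -mark-up-list) are recalled from the Advisor command-line tool of this period and should be checked against your installed version. The project directory, binary name and loop ID are placeholders.

```python
import shlex


def advisor_command(analysis, project_dir, app_args, extra_options=()):
    """Build an advixe-cl command line as a list of arguments (nothing is run)."""
    command = ["advixe-cl", "--collect", analysis]
    command += list(extra_options)
    command += ["--project-dir", project_dir, "--"]
    command += list(app_args)
    return command


# Placeholder values: replace with your own project directory and binary.
PROJECT_DIR = "./advisor_project"
APP = ["./my_app"]
LOOP_IDS = "1"  # hypothetical loop ID taken from a survey report

commands = [
    advisor_command("survey", PROJECT_DIR, APP),
    advisor_command("tripcounts", PROJECT_DIR, APP),
    # Map and dependencies need to be told which loops to analyse.
    advisor_command("map", PROJECT_DIR, APP, ["-mark-up-list=" + LOOP_IDS]),
    advisor_command("dependencies", PROJECT_DIR, APP, ["-mark-up-list=" + LOOP_IDS]),
]

for command in commands:
    print(" ".join(shlex.quote(arg) for arg in command))
```

To execute a collection rather than print it, each list could be passed to subprocess.run(command, check=True) once the Intel Advisor environment has been set up.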

Finally, you will need to copy the Intel Advisor reference examples to a test area.

Source: Intel

Note that all the scripts we ran for this article use the Python that currently ships with Intel Advisor on Linux. The standard distributions of Python should work just as well.

Using the Intel Advisor Python API

The reference examples provided are just a small set of the reporting that’s possible using this flexible way to access your program data. You can use the columns.py example to get a list of available data fields. For example, you could see the metrics in Table 1 after running a basic survey collection.

Table 1. Sample survey metrics

Source: Intel
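The columns.py example itself is not reproduced here, but the idea of listing the available data fields can be sketched in plain Python. This sketch assumes each row behaves like a dictionary keyed by field name. The rows and field names below are made-up stand-ins, not real Intel Advisor output, and real Advisor rows may expose their fields differently.

```python
def available_fields(rows):
    """Return the sorted set of field names seen across all rows."""
    names = set()
    for row in rows:
        names.update(row.keys())
    return sorted(names)


# Stand-in rows with hypothetical field names and values, for illustration only.
sample_rows = [
    {"loop_name": "loop_a", "self_time": 1.5},
    {"loop_name": "loop_b", "self_time": 0.5, "trip_count": 100},
]

for name in available_fields(sample_rows):
    print(name)
```

Taking the union across all rows, rather than reading only the first row, covers the case where some fields appear only after additional collections have been run.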

Intel Advisor Python API in Action

Let’s walk through a simple example that shows how to collect some powerful metrics using the Intel Advisor Python API. The first step is to import the Intel Advisor library package.

Source: Intel
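If the import fails, the usual cause is that the Intel Advisor environment has not been set up in the current shell. A small check like the following, which uses only the standard library, can confirm whether Python can locate the package before the rest of a script runs. The module name advisor is taken from the Intel reference examples; treat it as an assumption if your version differs.

```python
import importlib.util


def module_available(name):
    """Return True if Python can locate a top-level module called `name`."""
    return importlib.util.find_spec(name) is not None


if module_available("advisor"):
    print("advisor package found")
else:
    print("advisor package not found: set up the Intel Advisor environment first")
```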

You then need to open the Intel Advisor project that contains the result you’ve collected.

Source: Intel

You also have the option of creating a project and running collections from the script. (In this example, we simply call open_project on an existing project.) Here we access data from the memory access pattern (MAP) collection, using the following line of code:

Source: Intel

Once we’ve loaded this data, we can loop through the table and gather cache utilisation statistics. We then print out the data we’ve collected:

Source: Intel
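Put together, the steps above might look something like the sketch below. The call names (advisor.open_project, project.load, advisor.MAP and the data.map table) are recalled from the Intel reference examples and should be treated as assumptions to check against your installation. The field names in the stand-in rows are hypothetical and only there so the formatting step can run without Intel Advisor.

```python
def load_map_rows(project_dir):
    """Open an existing Advisor project and return the rows of its MAP table.

    Requires the Intel Advisor environment; the API names are assumptions.
    """
    import advisor  # imported lazily so the rest of the sketch runs without it

    project = advisor.open_project(project_dir)
    data = project.load(advisor.MAP)
    return list(data.map)


def format_rows(rows, fields):
    """Render the chosen fields of each row as one line of text per row."""
    lines = []
    for row in rows:
        parts = ["{}={}".format(name, row[name]) for name in fields]
        lines.append(", ".join(parts))
    return lines


# Stand-in data with hypothetical field names, used when Advisor is absent.
demo_rows = [
    {"site_id": 1, "cache_lines_used": 12},
    {"site_id": 2, "cache_lines_used": 40},
]

for line in format_rows(demo_rows, ["site_id", "cache_lines_used"]):
    print(line)
```

With a real project, load_map_rows("path/to/project") would take the place of demo_rows, and the field list would come from whatever the MAP collection actually recorded.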

Intel Advisor Python API Advanced Topics

The examples provided as part of the Intel Advisor Python API give you a blueprint for writing your own scripts. Table 2 shows some of these advanced capabilities.

Table 2. Intel Advisor Python API advanced capabilities

Source: Intel

Here are some highlights of various examples. We are constantly adding to the list of examples.

Generate a combined report showing all data collected:

Source: Intel

Generate an HTML report:

Source: Intel

You can generate a Roofline HTML chart (Figure 1) with this code:

Source: Intel

You must run the roofline.py script with an external Python command and not advixe-python. It currently runs only on Linux and requires the additional libraries numpy, pandas, and matplotlib to be installed.

Use this code to generate cache simulation statistics:

Source: Intel

You can see the results obtained from the cache model in Table 3.

Table 3. Cache model results

Source: Intel

Case Study: Vectorisation Comparison

In this case study, we create a Python script that can compare the vectorisation of a given loop when compiled with different compiler options.

Step 1: Compile Code with Different Optimisation Flags
First, compile the app with different options. In this example, we use the Intel® C++ Compiler (but Intel Advisor works at the binary level, so any compiler should work). In the first case, we are compiling without optimisation using the compiler option -O0. The second case uses full optimisation -O3.

Source: Intel

Step 2: The Python Code
The script is very simple. First, it reads some arguments from the command line. If it is passed an Intel Advisor project, it uses the data contained in that project. Otherwise, it performs an Intel Advisor survey run. Once the survey runs complete, it decodes the assembly for the loops and prints the instructions of the two loops side by side. The main function in our Python code is named get_formatted_asm. This function accesses the Intel Advisor database and decodes the assembly for our loops. It can also check whether the assembly code uses vector instructions, as well as how fast the loop executed.

Source: Intel
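The original get_formatted_asm function is not reproduced here. As a rough illustration of the two ideas it combines, the sketch below scans already-decoded assembly text for x86 vector register names and pairs two listings side by side. The function names and sample instructions are hypothetical. Register width alone does not prove vectorisation, since scalar floating-point code also uses XMM registers, so treat the result as a hint rather than a verdict.

```python
import re
from itertools import zip_longest

# Register name prefix -> width in bits (SSE, AVX/AVX2, AVX-512).
REGISTER_BITS = {"xmm": 128, "ymm": 256, "zmm": 512}
REGISTER_PATTERN = re.compile(r"\b([xyz]mm)\d+\b", re.IGNORECASE)


def widest_vector_register(asm_lines):
    """Return the widest vector register width (in bits) used, or 0 if none."""
    widest = 0
    for line in asm_lines:
        for prefix in REGISTER_PATTERN.findall(line):
            widest = max(widest, REGISTER_BITS[prefix.lower()])
    return widest


def side_by_side(left, right, width=40):
    """Pair two instruction listings line by line, padding the shorter one."""
    return [
        "{:<{w}} | {}".format(a, b, w=width)
        for a, b in zip_longest(left, right, fillvalue="")
    ]


# Illustrative listings, not output from the article's application.
scalar_loop = ["movss xmm0, dword ptr [rax]", "addss xmm0, xmm1"]
avx2_loop = [
    "vmovups ymm0, ymmword ptr [rax]",
    "vaddps ymm0, ymm0, ymm1",
    "add rax, 32",
]

for row in side_by_side(scalar_loop, avx2_loop):
    print(row)
print("widest registers:", widest_vector_register(scalar_loop),
      "vs", widest_vector_register(avx2_loop), "bits")
```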

Step 3: Run the Python Script

Source: Intel

Step 4: Recompile with AVX2 Vectorisation

Now let’s try a further optimisation. Since our processor supports the AVX2 instruction set, we are going to tell the compiler to generate AVX2. (Note that this is generally not what the compiler will generate by default.)

Source: Intel

Step 5: Rerun the Comparison

Source: Intel

You can see that the assembly code now uses YMM registers instead of XMM, doubling the vector length and giving a 2X speedup.

Results
The gains we made by optimising and by using the latest vectorisation instruction set were significant:
• No optimisation (-O0): 45.148 seconds
• Optimising with -O3: 4.403 seconds
• Optimising with AVX2 (-O3 -AVX2): 2.056 seconds
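To put these timings in perspective, the speedups can be worked out directly from the three figures above. The short calculation below gives roughly 10.3x from enabling optimisation, a further 2.1x from AVX2, and about 22x overall.

```python
# Timings reported in the results above, in seconds.
timings = {"-O0": 45.148, "-O3": 4.403, "-O3 with AVX2": 2.056}


def speedup(baseline_seconds, improved_seconds):
    """Return how many times faster the improved run is than the baseline."""
    return baseline_seconds / improved_seconds


o0_to_o3 = speedup(timings["-O0"], timings["-O3"])
o3_to_avx2 = speedup(timings["-O3"], timings["-O3 with AVX2"])
overall = speedup(timings["-O0"], timings["-O3 with AVX2"])

print("-O0 to -O3:          {:.2f}x".format(o0_to_o3))
print("-O3 to -O3 with AVX2: {:.2f}x".format(o3_to_avx2))
print("-O0 to -O3 with AVX2: {:.2f}x".format(overall))
```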

Maximising System Performance

On modern processors, it’s crucial to both vectorise and thread software to realise the full performance potential of the processor. The new Intel Advisor Python API in Intel Parallel Studio XE provides a powerful way to generate program statistics and reports that can help you get the most performance out of your system. The examples outlined in this article illustrate the power of this new interface. Based on your specific needs, you can tailor and extend these examples. Intel is actively gathering feedback on the Intel Advisor Python API. If you’ve tried it and found it useful, or would like to provide feedback, send email to: vector_advisor@intel.com

[This article written by Kevin O’Leary, Technical Consulting Engineer, and Egor Kazachkov, Senior Software Developer, Intel Corporation, was first published in Issue 31 of The Parallel Universe magazine and re-published with permission.]

If you want to find out more about Intel Software, contact our Intel specialists: developer@greymatter.com or call +44 (0)1364 655 180.

We are hosting a webinar on 30 April about how you can use parallelism and profiling to improve the performance of Python code. Find out more here.
