AWS Certification for Data Science | Is it Mandatory?

Console management

This connection method uses the console port on a device to configure it. Physical access to the device is required when directly connected to a workstation (laptop or desktop) computer. In a data center environment where there are many devices within one or more racks, console ports are typically connected to a console server to aggregate the console connections in the rack.

A single console server usually provides enough ports to manage all of the devices in a rack. There are several third-party manufacturers to choose from that offer console servers. These devices offer an Ethernet NIC port to allow an uplink to an Ethernet network, such as in-band or OOB, as shown in Figure 2. Depending on the console server model, multiple network administrators can use application layer protocols such as SSH, Telnet, HTTP, and others, to log in to the console server and access the console ports of the devices attached.

Data science engineering is a field that combines the skills of data science with the principles and practices of software engineering. It involves designing, building, and maintaining systems that are able to collect, process, and analyze large amounts of data in order to extract valuable insights and knowledge.

Data science engineers typically have a strong background in computer science and programming, as well as expertise in statistical analysis and machine learning. They are responsible for building and maintaining the infrastructure and systems that enable data scientists to work effectively, including data pipelines, data lakes, and data warehouses. They may also be involved in the design and implementation of machine learning models, as well as the deployment of those models into production environments.

Overall, the role of a data science engineer is to bridge the gap between data science and software engineering, bringing together the skills and knowledge needed to build and maintain large-scale data systems and to apply data-driven techniques to solve real-world problems.

AWS Certification for Data Science

There are several Amazon Web Services (AWS) certification options related to data science. These certifications demonstrate proficiency in various aspects of data science and cloud computing, and can be valuable for professionals working in the field. Here are some of the options:

  • AWS Certified Machine Learning – Specialty: This certification is designed for professionals who have experience designing, implementing, and maintaining machine learning solutions on the AWS platform. It covers topics such as machine learning concepts, AWS services for machine learning, and machine learning workloads on the AWS platform.
  • AWS Certified Big Data – Specialty: This certification is for professionals who have experience working with big data and want to demonstrate their skills in using AWS technologies to process, store, and analyze large datasets. It covers topics such as big data concepts, AWS services for big data, and design and implementation of big data solutions on the AWS platform.
  • AWS Certified Data Analytics – Specialty: This certification is for professionals who have experience working with data analytics and want to demonstrate their skills in using AWS technologies to process, store, and analyze data. It covers topics such as data analytics concepts, AWS services for data analytics, and design and implementation of data analytics solutions on the AWS platform.
  • AWS Certified Solutions Architect – Associate: While not specifically focused on data science, this certification is relevant for professionals working in the field as it covers the design and implementation of solutions on the AWS platform, including data processing and analytics. It is suitable for professionals who have experience designing and deploying cloud infrastructure.

Is an AWS Certification for Data Science Recommended?

Whether or not an AWS certification in data science is recommended for you depends on your goals and circumstances. Here are a few factors to consider:

  • Are you looking to build or improve your skills in data science and cloud computing? AWS certifications can be a good way to learn new technologies and best practices, and to demonstrate your expertise to potential employers.
  • Do you want to advance your career in data science? An AWS certification may be viewed positively by employers and could potentially lead to new job opportunities or salary increases.
  • Do you have the necessary experience and knowledge to successfully complete the certification exams? AWS recommends that candidates have at least one year of hands-on experience with the relevant technologies before attempting the certification exams. If you do not have this level of experience, you may want to gain more experience before pursuing a certification.

  • Microsoft Certified: Azure Data Scientist Associate
  • AWS Certified Machine Learning – Specialty
  • Google Cloud Certified: Professional Machine Learning Engineer

In this article, I compare these three certifications under 8 categories:

  • Ease of preparation
  • Affordability
  • Exam experience
  • Challenge the Data Scientist in you
  • Scope of improvement post exam
  • Post certification benefits
  • Which one should you take?
  • Where would I use each?

Now that I’ve revealed the whole content, I would also like to mention that these cloud providers and exam providers keep changing their exam content and format at intervals they see fit. So, some of the points I am providing now in July 2021 might not apply if you’re reading this long afterwards. Still, I hope some timeless points in this comparison help you make informed decisions.

Ease of preparation:
The first step to taking these certifications is learning what each provider has to offer in end-to-end ML solutions.

So it comes down to how easy and accessible it is to gain provider-specific knowledge (theory and hands-on) of data pre-processing, exploratory analysis, modeling, deployment and operations. Azure was the most accessible, as they provide a free learning path that is very detailed, has everything you need to know in one place and also lets you spin up assets to try out their offerings. They also offer free official practice tests.

AWS would be next. They provide an exam readiness course, which serves as an introduction and overview, but I definitely wouldn’t call it exhaustive or a single stop for the exam. Being ready for the exam requires, in addition to strong ML basics, thorough knowledge of the ocean of cloud ML offerings AWS has. And with AWS being the oldest cloud player, the ocean is wide and deep. The data engineering offerings in particular took me a while to wrap my head around, given the number of alternative instances and services built for the same overlapping use cases with only minute differences.

For this, one might use external paid courses like the one offered by SunDog Education, but also make sure to go through the developer documentation for all the relevant offerings on the AWS website. A trick I used to make my life easier with the developer docs was to use the Edit -> Speech -> Start Speaking option (or similar functionality) provided by most browsers. This worked for me because I remember what I hear better than what I read.

GCP was the hardest to prepare for, since the certification itself is very new and was still in Beta a while ago. The webpage for this exam appears to provide a learning path, but it only points you to several external links and courses, a lot of which are paid. Hence, it is definitely not a single stop resource. There is a Coursera course which might serve as a single stop resource, but I did not try it as I was not willing to pay for it (and didn’t want to rush through the 7 day free trial either). People willing to pay can try it out and let me and others know if it was useful.

So how did I prepare for the GCP ML exam? I relied on a lot of Medium blogs from past test takers. I should say that I did get sidetracked by some of the blogs from Beta exam takers, since the Beta exam guide and the current exam guide happen to be different. I came across Sathish VJ’s blog and used that as a guide to land at the initial developer docs and navigate to related developer docs from there. I also used the same Edit -> Speech -> Start Speaking technique again.

I can say this blog was a very good outline, but some of the links it points to may not be valid since GCP seems to keep rebranding stuff like AI Platform to Vertex AI etc. So whatever you read, you’ll have to read with the old branding and new branding in mind as it’s not clear which one is going to be asked about in the exam. GCP has an opportunity to improve themselves in this regard and provide a better exam readiness course, which is possibly also updated with each of their rebranding. I did also have access to the partner learning provided by GCP for a short period of 2 weeks through my employer, but that can use some restructuring too as it has a lot of repetitive information especially around AI Platform Pipelines and Kubeflow and is not organized very well under reflective headings and sections, in my opinion.

Affordability:
The affordability ranking, from most to least affordable, is as follows: Azure > GCP > AWS

Azure costs about $100, GCP costs $120 and AWS costs $300 (all before tax). One should consider validity along with cost too. While both the Azure and GCP ML certifications are valid for 2 years, AWS is valid for 3 years. Even with that one extra year, on a per-year basis AWS ($100 per year) costs 2x Azure ($50 per year) and a little above 1.5x GCP ($60 per year).
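The per-year comparison above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the fees and validity periods are the ones quoted in this article (July 2021, before tax), and the names `exams` and `cost_per_year` are my own, made up for illustration.

```python
# Exam fee (USD, before tax) and validity in years, as quoted in this article.
exams = {
    "Azure": {"fee": 100, "valid_years": 2},
    "GCP": {"fee": 120, "valid_years": 2},
    "AWS": {"fee": 300, "valid_years": 3},
}


def cost_per_year(fee, valid_years):
    """Spread the exam fee evenly over the years the certification stays valid."""
    return fee / valid_years


per_year = {
    name: cost_per_year(e["fee"], e["valid_years"]) for name, e in exams.items()
}

# Print from cheapest to most expensive per year.
for name, cost in sorted(per_year.items(), key=lambda kv: kv[1]):
    print(f"{name}: ${cost:.2f} per year")

# Ratios of AWS against the other two.
print(f"AWS vs Azure: {per_year['AWS'] / per_year['Azure']:.2f}x")
print(f"AWS vs GCP: {per_year['AWS'] / per_year['GCP']:.2f}x")
```

If the fees or validity periods have changed by the time you read this, updating the `exams` dictionary is enough to redo the comparison.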

This is when we consider only the ML exams against each other. One more thing about AWS is that if you take another, cheaper AWS exam first, you get 50% off the ML Specialty exam and also get a free practice test. That can bring the cost of this single exam down a bit.

Exam Experience:
This includes experience during the duration of the exam and up to the point you receive results.

Both Azure and AWS are 3 hour long exams, with Azure having a varying number of questions (up to 80) and AWS having a fixed 65. The exam provider I used for both is PearsonVue. While I noticed a calculator being present for Azure, I did not find one for AWS and did not end up needing it either. GCP has 60 questions and a duration of 2 hours and is delivered through the test provider Kryterion. Between PearsonVue and Kryterion, I liked Kryterion’s UX better.

One might be wondering how someone can manage GCP when it has nearly the same number of questions as AWS but provides an hour less. I would say GCP is ideally timed, since even 2 hours is a long time to be staring non-stop at a screen (with even looking away from the screen being treated as a sign of cheating in remote-proctored exams during the pandemic). Besides, for AWS and Azure I ended up with half an hour and an hour to spare respectively. GCP was the one I finished right on time.
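To put the timing in perspective, here is a rough sketch of the time budget per question for each exam. It uses only the durations and question counts mentioned above, and takes the Azure maximum of 80 questions as the tightest case. The variable names are my own.

```python
# Duration (minutes) and number of questions, as described in this article.
# Azure has a varying number of questions; 80 is the upper bound.
exams = {
    "Azure": {"minutes": 180, "questions": 80},
    "AWS": {"minutes": 180, "questions": 65},
    "GCP": {"minutes": 120, "questions": 60},
}

minutes_per_question = {
    name: e["minutes"] / e["questions"] for name, e in exams.items()
}

for name, budget in minutes_per_question.items():
    print(f"{name}: {budget:.2f} minutes per question")
```

Under these numbers GCP leaves the least time per question, which fits my experience of finishing it with no time to spare.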