Whether you need a long-term project or a short-term engagement, support for the full life cycle of your cloud and data solutions, or help building and optimizing a platform: we can help.
With every consultant, you get the knowledge of the entire Smartworkz team at your disposal to develop state-of-the-art data and cloud platforms.
Audit and Second Opinion
Not sure if you made the right choice? Our experts are happy to give you a second opinion.
Clean code for data scientists
Data scientists increasingly operate in a world that extends beyond the local 'island' of their Jupyter notebooks. The business needs trained models to be included in the company's production environment. The code starts a long life outside the notebook, where it will be changed and maintained by other data scientists.
With this transition, different requirements are suddenly imposed on the code. Where the goal was first to develop the most accurate machine learning model possible, it now turns out that the code cannot simply be deployed to a production environment, or is hard for others to maintain. Data and DevOps engineers bring their expertise to help with this, but these disciplines do not always speak the same language and often hold different views on how software should be built.
In this hands-on workshop, designed especially for data scientists, you will be introduced to the world of clean code, and we will look at problems we have encountered while taking machine learning models to production. We do this through a series of short refactoring katas, each preceded by a bit of theory. After this workshop, the transition from local data science code to a production-worthy product becomes easier, delivering business value faster.
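To give a feel for the kind of refactoring kata involved, here is a minimal sketch (the field names, rules, and functions are hypothetical illustrations, not the workshop's actual material): tangled notebook-style code is extracted into small, named functions that can be unit-tested and reused outside the notebook.

```python
# Hypothetical refactoring kata: from inline notebook code to small,
# testable functions. Names and logic are illustrative only.

# Before (typical notebook cell): everything inline, hard to test or reuse.
# cleaned = []
# for r in rows:
#     if r["age"] is not None and r["age"] >= 0:
#         cleaned.append({"age": r["age"], "income": r["income"] / 1000})

from typing import Iterable

def is_valid(row: dict) -> bool:
    """A row is valid when age is present and non-negative."""
    return row.get("age") is not None and row["age"] >= 0

def normalise(row: dict) -> dict:
    """Scale income to thousands; keep only the fields the model needs."""
    return {"age": row["age"], "income": row["income"] / 1000}

def prepare(rows: Iterable[dict]) -> list[dict]:
    """Small, named steps compose into a pipeline that is easy to unit-test."""
    return [normalise(r) for r in rows if is_valid(r)]

print(prepare([{"age": 34, "income": 52000}, {"age": None, "income": 10000}]))
# → [{'age': 34, 'income': 52.0}]
```

Each small function can now be covered by a plain unit test, which is exactly what a production environment expects of the code.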
No more messy data
Dirty data is expensive. Organizations estimate that, on average, 25 per cent of their data is inaccurate, a figure that directly affects the bottom line. This is well articulated in the 1-10-100 rule: if it costs you $1 to prevent a data error, it will cost you $10 to correct it afterwards, and $100 if the error goes on to cause a problem.
We build automated tests in your data pipeline so that no drop of dirty data can flow in.
Within approximately two weeks, with the support of your domain specialists, we will:
build an automatic data testing framework into your existing data pipeline;
fully program one data pipeline with automated tests;
and provide you with all the necessary knowledge in workshops, so that you can add automated tests to the rest of your data pipelines yourself.
Our approach ensures that companies prevent data pollution at the source: in the data pipeline itself.
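The steps above can be sketched in a few lines. This is a deliberately minimal illustration, not our actual framework; the field names and rules are hypothetical, and dedicated tools such as Great Expectations apply the same idea at scale. The key point is that the load step refuses any batch that fails its checks, so dirty data never enters the pipeline.

```python
# Minimal sketch of automated data tests guarding a pipeline step.
# Field names and validation rules are hypothetical examples.

def validate(records: list[dict]) -> list[str]:
    """Return a list of error messages; an empty list means the batch is clean."""
    errors = []
    for i, rec in enumerate(records):
        if rec.get("customer_id") is None:
            errors.append(f"row {i}: missing customer_id")
        if not (0 <= rec.get("discount", 0) <= 1):
            errors.append(f"row {i}: discount outside [0, 1]")
    return errors

def load(records: list[dict]) -> None:
    """Refuse to load a batch that fails validation: stop dirty data at the source."""
    problems = validate(records)
    if problems:
        raise ValueError("; ".join(problems))
    # ... write the clean batch to the warehouse here ...

batch = [{"customer_id": 1, "discount": 0.1},
         {"customer_id": None, "discount": 1.5}]
print(validate(batch))
# → ['row 1: missing customer_id', 'row 1: discount outside [0, 1]']
```

Because the checks run inside the pipeline rather than after the fact, errors cost $1 to prevent instead of $10 or $100 to repair.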
Amazon SageMaker is a fully managed service that enables developers and data scientists to quickly and easily build, train, and deploy machine learning models at any scale. This workshop will guide you through using the numerous features of SageMaker.
You’ll start by creating a SageMaker notebook instance with the required permissions. You will then interact with SageMaker via sample Jupyter notebooks, the AWS CLI, the SageMaker console, or all three. During the workshop, you’ll explore various data sets, create model training jobs using SageMaker’s hosted training feature, and create endpoints to serve predictions from your models using SageMaker’s hosted endpoint feature.