The Etiq toolkit is a unified, centralized way to set up checks and tests at different stages of the pipeline, across pre- and post-production, with out-of-the-box tests as well as customization options.
The Etiq toolkit helps identify issues and test machine learning algorithms to prevent accuracy loss in production, reduce the time it takes to validate a model and move it from prototype to production-level pipelines, and prevent downstream operational issues arising from poorly functioning or misunderstood ML models.
Pain points & benefits
Some things Etiq can help with:
- Data issues upstream: Etiq gives you an easy way to set up tests for these issues, plus the peace of mind of knowing later in the pipeline (at validation or production stages) whether those data tests passed.
- See your models go into production sooner. Shared tests across data scientists and data engineers mean everyone is looking at the same thing.
- Issues such as bias, sensitivity, poor accuracy for specific segments, unobserved data leakage, and so forth cause accuracy losses in production, which even in the best case mean the model is not performing as well as it could. Test for these issues with out-of-the-box tests and get peace of mind.
- Getting a lot of queries from customer ops teams or business stakeholders about why your models behave the way they do? Cut down the number of queries with automated reports where they can check directly why the model is doing what it's doing. Get your time back to focus on building models.
- Comprehensive analysis of datasets and models to identify: data inconsistencies (e.g. missing values, changing features, new features), data leakage, accuracy issues, robustness, sensitivity, drift, bias, and anomalies.
- Customize definitions to your use case and preferences. Integrate open source algorithms into your testing and into how you define issues. To customize the thresholds, use a config file that can be reused across the pipeline.
- Identify which segments are impacted by the issues found. An interface makes it easy to view the issues and affected segments and to find the root cause of each issue. Etiq finds the impacted segments automatically, but you can also pre-set your own segments to check.
- Version control and team sharing included. All your tests and checks are stored for you to retrieve whenever needed. Make them accessible to the rest of your team without having to exchange Jupyter notebooks. Set tests at the exploration stage that will be checked when putting the model into production.
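To make the ideas of a reusable threshold config and per-segment checks more concrete, here is a minimal sketch in plain Python. It does not use Etiq's API: the config keys (`thresholds`, `min_accuracy`, `max_missing_rate`, `segment_by`), the function names, and the sample rows are all hypothetical, chosen only to illustrate how one config can drive the same checks at several pipeline stages.

```python
import json
from collections import defaultdict

# Hypothetical config format, NOT Etiq's actual schema.
CONFIG_JSON = """
{
  "thresholds": {"min_accuracy": 0.8, "max_missing_rate": 0.1},
  "segment_by": "region"
}
"""


def load_config(text):
    """Parse the reusable config once; every pipeline stage can share it."""
    return json.loads(text)


def missing_rate(rows, feature):
    """Fraction of rows where the feature is absent or None."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(feature) is None)
    return missing / len(rows)


def accuracy_by_segment(rows, segment_key):
    """Accuracy of 'prediction' against 'label' within each segment value."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for row in rows:
        segment = row[segment_key]
        totals[segment] += 1
        hits[segment] += int(row["prediction"] == row["label"])
    return {segment: hits[segment] / totals[segment] for segment in totals}


def failing_segments(rows, config):
    """Return the segments whose accuracy is below the configured threshold."""
    threshold = config["thresholds"]["min_accuracy"]
    scores = accuracy_by_segment(rows, config["segment_by"])
    return {seg: acc for seg, acc in scores.items() if acc < threshold}


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    rows = [
        {"region": "north", "label": 1, "prediction": 1},
        {"region": "north", "label": 0, "prediction": 0},
        {"region": "south", "label": 1, "prediction": 0},
        {"region": "south", "label": 0, "prediction": 0},
    ]
    config = load_config(CONFIG_JSON)
    print(failing_segments(rows, config))
```

Keeping the thresholds in one config, rather than hard-coding them in each check, is what lets the same definitions of "issue" be applied at exploration, validation, and production stages.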
Super easy to set up and use:
- Log your pipeline as you would with any other logging mechanism
- Retrieve results in your notebook, in any Python-based IDE you use, or via the dashboard
- Access those results whenever you want to and share them
- Test library available for download to check out the functionality. The demo version includes one error type area (bias) that you can try to see if you like the toolkit
- On-prem deployment so that your data never leaves your setup.
There are two integration points for input and output:
- An API through which users log their models
- An API through which users can integrate information from the Etiq platform into other systems.
The diagram below shows a high-level architecture of the integration and the main user interaction points.
- Available via a SaaS platform. Test results and checks are computed on the user’s device and stored in Etiq’s AWS instance. Only results are stored outside the user’s infrastructure.
- AWS Marketplace. The Etiq platform is deployed on the user’s AWS infrastructure.
- Docker image. If the user prefers an on-prem deployment mechanism, that can be accommodated.