This guide provides detailed steps to create, train, and deploy a prediction model for classifying various document types on the Neutrinos AI Hub platform. With its intuitive interface and advanced capabilities, Neutrinos AI Hub simplifies document prediction while offering robust configurations.
Step 1: Logging In to the Platform
- Log in to Neutrinos AI Hub using the username and password provided by your organization.
- Once logged in, navigate to the Prediction Tab on the dashboard.
Step 2: Choosing the Right Prediction Type
The platform provides flexibility for different prediction needs:
- Document Tab: Select this tab for training models that classify document types.
- Text Prediction Tab: Use this for predictive models focusing on textual data, such as fraud detection or churn analysis.
Step 3: Adding a New Prediction Model
- Click the Add option to create a new model.
- The platform displays an overview of the training workflow, explaining how the process works.
Preparing Your Training Data
- If documents are already categorized offline, upload them in bulk to the platform.
- Alternatively, upload uncategorized documents and use the platform to tag and annotate them.
Why Categorize Documents?
Categorizing documents ensures the model understands the distinctions between document types. Proper tagging and categorization improve the accuracy and reliability of predictions.
Step 4: Uploading Files for Training
- Minimum Requirement: Upload at least 25 files per document category for medium accuracy.
- Best Practice: Upload more than 100 files per category to achieve higher prediction accuracy.
Example:
If you are training the model to classify documents into categories like “Invoices,” “Purchase Orders,” and “Receipts,” upload at least 25 files, and ideally more than 100, for each category.
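Before uploading, it can help to check locally that each category has enough files. The following is a minimal sketch, not part of the platform: it assumes a hypothetical local layout with one subfolder per category (for example `training/Invoices/`), and the function names are illustrative. Only the 25 and 100 thresholds come from this guide.

```python
from pathlib import Path

MIN_FILES = 25      # minimum per category stated in this guide
RECOMMENDED = 100   # the guide recommends more than this many per category


def rate_count(count):
    """Rate a per-category file count against the guide's thresholds."""
    if count < MIN_FILES:
        return "too few"
    if count <= RECOMMENDED:
        return "minimum met"
    return "recommended"


def check_training_folder(root):
    """Count files in each category subfolder of `root`.

    Assumes (hypothetically) one subfolder per document category.
    Returns {category_name: (file_count, rating)}.
    """
    report = {}
    for category_dir in sorted(Path(root).iterdir()):
        if category_dir.is_dir():
            count = sum(1 for f in category_dir.iterdir() if f.is_file())
            report[category_dir.name] = (count, rate_count(count))
    return report
```

Calling `check_training_folder("training")` on such a folder would return one entry per category, which makes it easy to spot a category that falls short before starting an upload.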
Step 5: Configuring Page Splitting
Some documents may consist of multiple pages that require individual classification. The platform provides the following options:
- Yes: The system splits and classifies pages individually.
- No: Use this if your documents are single-page or do not require splitting.
Example Use Case:
For multi-page documents like ID cards with separate front and back pages, enable splitting to classify them separately. Otherwise, you can group pages under a single document type.
Step 6: Setting Up the Model Details
Provide the necessary details for the model, including:
- Model Name: A descriptive name for the model.
- Description: A short summary of the model’s purpose.
Training Rules:
- Feedback Loop Configuration:
Why It Matters:
Feedback loops help improve model accuracy by identifying areas for re-training. This ensures the model evolves with user feedback and achieves higher reliability over time.
- Always: Tag all inference requests for review, approval, or re-training.
- Never: Skip tagging (ideal for high-performing models in production).
- Confident: Tag requests only when the model's confidence falls below a configured threshold.
- Retention Settings:
- Define the retention period for data stored on Neutrinos Cloud.
- Note: Tagged data will remain until manually processed, ensuring compliance with review workflows.
- Document Merging:
- Combine multi-page classifications into a single file when required. For example, group “EID Front” and “EID Back” into one document.
- Advanced Configurations:
- Improve document quality using features like:
- Enhancing image contrast.
- Rotating or resizing.
- Removing watermarks.
- Mirroring or flipping images.
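The three feedback loop options in this step (Always, Never, Confident) amount to a simple decision rule per inference request. The sketch below illustrates that rule only; it is not the platform's implementation. The function name, the lowercase mode strings, and the default threshold of 0.8 are assumptions made for illustration.

```python
def should_tag_for_review(mode, confidence, threshold=0.8):
    """Decide whether an inference request is tagged for review.

    mode: "always", "never", or "confident" (names mirror the guide's options).
    confidence: model confidence for the prediction, in the range 0..1.
    threshold: hypothetical cutoff used only in "confident" mode.
    """
    mode = mode.lower()
    if mode == "always":
        return True                      # every request is tagged
    if mode == "never":
        return False                     # tagging is skipped entirely
    if mode == "confident":
        return confidence < threshold    # tag only low-confidence predictions
    raise ValueError(f"unknown feedback mode: {mode!r}")
```

With these assumed values, a prediction at 0.65 confidence would be tagged in Confident mode, while one at 0.93 would pass through untagged.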
Step 7: Starting the Training Process
- Click Start Training to initiate the process.
- The platform evaluates multiple models using an 80:20 data split for training and testing.
- The platform identifies the top 7 candidate models suited to the incoming data and batch size, then trains and tests each one, ranking them by a weighted average score.
- It automatically selects the best-performing model based on metrics such as precision and F1 score.
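To make the 80:20 split and the weighted-average selection concrete, here is a small sketch of the general idea. The guide does not describe the platform's internals, so the metric weights, model names, and scores below are hypothetical, and the functions are illustrative rather than the platform's actual procedure.

```python
import random


def split_80_20(items, seed=0):
    """Shuffle items and split them into 80% training and 20% testing."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


def weighted_score(metrics, weights):
    """Weighted average of the metrics named in `weights`."""
    total = sum(weights.values())
    return sum(metrics[name] * w for name, w in weights.items()) / total


def pick_best(candidates, weights):
    """Return the name of the candidate with the highest weighted score."""
    return max(candidates, key=lambda name: weighted_score(candidates[name], weights))


# Hypothetical weights and test-set scores, for illustration only.
weights = {"precision": 0.5, "f1": 0.5}
candidates = {
    "model_a": {"precision": 0.90, "f1": 0.86},
    "model_b": {"precision": 0.88, "f1": 0.91},
}

train, test = split_80_20(range(100))
best = pick_best(candidates, weights)
```

In this example `model_b` wins because its weighted score (0.895) is higher than that of `model_a` (0.88), even though `model_a` has the better precision.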
Training Duration:
Training time depends on the number of document categories, the volume of data, and the selected quality level. It typically takes 15 minutes to 2 hours.
Step 8: Viewing and Testing the Trained Model
- Once training is complete, navigate to the Prediction List and select the trained model.
- Review model performance metrics, including:
- Confidence: Indicates how certain the model is about its predictions.
- Precision and F1 Score: Key indicators of model accuracy.
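For readers unfamiliar with these metrics, the sketch below shows the standard definitions of precision, recall, and F1 score for a single document category. The counts are made up, and the guide does not state how the platform aggregates these metrics across categories, so treat this as background rather than a description of the platform's reporting.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from raw counts for one category.

    tp: documents correctly predicted as the category
    fp: documents wrongly predicted as the category
    fn: documents of the category that were predicted as something else
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for an "Invoices" category.
p, r, f1 = precision_recall_f1(tp=45, fp=5, fn=15)
```

Here precision is 0.90 (45 of 50 predicted invoices were correct), recall is 0.75 (45 of 60 actual invoices were found), and F1, the harmonic mean of the two, is roughly 0.82.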
Testing Options:
- Single Test: Test individual documents for predictions.
- Batch Test: Test multiple documents simultaneously to validate bulk predictions.
Step 9: Reviewing Tagged Inference Requests
The Review Hub allows users to:
- Review inference requests tagged based on configured feedback rules.
- Accept predictions or request re-training to improve accuracy further.
Step 10: Integrating the Model
- Go to the Integrations section to find APIs for:
- Single inference requests.
- Batch inference requests.
- Schedule jobs for bulk document predictions. The platform supports integrations with:
- CMIS-compliant DMS.
- Amazon S3.
- SFTP.
- Network Folders.
- Other supported sources.
Example:
Set up a scheduled job to classify documents stored in an S3 bucket and automatically organize results into folders.
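The "organize results into folders" part of this example can be pictured as mapping each source file to a destination path based on its predicted category. The sketch below shows only that mapping. It does not call Amazon S3 or any Neutrinos API; the `classified` prefix, the folder naming, and the sample keys and labels are all hypothetical.

```python
import posixpath


def destination_key(source_key, label, output_prefix="classified"):
    """Build a destination key of the form <prefix>/<category>/<filename>.

    The category folder name is the predicted label, lowercased, with
    spaces replaced by underscores (an illustrative convention).
    """
    filename = posixpath.basename(source_key)
    folder = label.strip().replace(" ", "_").lower()
    return posixpath.join(output_prefix, folder, filename)


# Hypothetical predictions: source object key -> predicted document type.
predictions = {
    "inbox/scan_001.pdf": "Invoices",
    "inbox/scan_002.pdf": "Purchase Orders",
}

moves = {src: destination_key(src, label) for src, label in predictions.items()}
```

With these inputs, `inbox/scan_002.pdf` would be routed to `classified/purchase_orders/scan_002.pdf`. The actual copy or move would be performed by the scheduled job or by whatever storage client you use.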
Key Features of Neutrinos AI Hub
- Multi-Model Training: The platform trains multiple models and selects the best performer for deployment.
- Customizable Retention Policies: Ensures compliance with data privacy and retention regulations.
- Advanced Document Enrichment: Offers features such as watermark removal and contrast enhancement to improve document quality.
- Seamless Integrations: Supports real-time and batch predictions via APIs and popular data sources.
Conclusion
Neutrinos AI Hub simplifies the process of training, testing, and deploying document prediction models. By leveraging its feedback loops, advanced configurations, and robust integration capabilities, you can create powerful, scalable solutions tailored to your document classification needs.