Upload Files and Parse Files
This guide walks you through uploading files to UnDatasIO and parsing them. It covers CSV files, JSON data, Excel spreadsheets, and the other supported formats.
Overview
UnDatasIO provides a powerful and intuitive interface for uploading and parsing various file formats. The process involves three main steps:
- Upload your file to the platform
- Configure parsing options based on your file type
- Parse and review the results
Supported File Formats
UnDatasIO supports a wide range of file formats:
Text-Based Formats
- CSV (.csv) - Comma-separated values
- TSV (.tsv) - Tab-separated values
- JSON (.json) - JavaScript Object Notation
- XML (.xml) - Extensible Markup Language
- TXT (.txt) - Plain text files
Spreadsheet Formats
- Excel (.xlsx, .xls) - Microsoft Excel files
- Google Sheets - Exported as CSV or Excel
Document Formats
- PDF (.pdf) - Portable Document Format
- Word (.docx, .doc) - Microsoft Word documents
Database Formats
- SQL (.sql) - SQL dump files
- Database exports - Various database export formats
Step-by-Step Guide
Step 1: Access the Upload Interface
- Log in to your UnDatasIO account at app.undatasio.com
- Navigate to the main dashboard
- Click the “Upload Files” button or drag files directly to the upload area
Step 2: Upload Your File
Method 1: Drag and Drop
- Open your file explorer
- Select the file you want to upload
- Drag the file to the upload area on the UnDatasIO interface
- Release to start the upload
Method 2: File Browser
- Click “Choose File” or “Browse”
- Navigate to your file location
- Select the file
- Click “Open” to start the upload
Method 3: Multiple File Upload
- Hold Ctrl (Windows) or Cmd (Mac) to select multiple files
- Drag all selected files to the upload area
- Release to upload all files simultaneously
Step 3: File Processing
Once your file is uploaded, UnDatasIO will:
- Analyze the file content and format
- Detect the file type automatically
- Display a preview of the file content
- Suggest optimal parsing settings
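As a rough illustration of what automatic type detection can involve, the sketch below guesses a format from the file extension and falls back to inspecting the content. It is not UnDatasIO's detection logic; the helper name `guessFormat` and its rules are assumptions made for this example.

```javascript
// Hypothetical helper: NOT UnDatasIO's actual detection logic.
// Guesses a format from the extension first, then from a content sample.
function guessFormat(fileName, sample) {
  const byExtension = { csv: "csv", tsv: "tsv", json: "json", xml: "xml", txt: "text" };
  const ext = fileName.includes(".") ? fileName.split(".").pop().toLowerCase() : "";
  if (byExtension[ext]) return byExtension[ext];

  const text = sample.trimStart();
  if (text.startsWith("{") || text.startsWith("[")) {
    try {
      JSON.parse(sample);
      return "json";
    } catch {
      // Not valid JSON: keep sniffing
    }
  }
  if (text.startsWith("<")) return "xml";

  // Look only at the first line for a delimiter hint
  const firstLine = text.split(/\r?\n/, 1)[0];
  if (firstLine.includes("\t")) return "tsv";
  if (firstLine.includes(",")) return "csv";
  return "text";
}

console.log(guessFormat("export", '{"rows": []}')); // "json"
```

If the guess is wrong, the extension and the content probably disagree, which is the situation the “Invalid file format” error describes.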
Step 4: Configure Parsing Options
For CSV Files
Basic Settings:
- Delimiter: Choose the character that separates values
  - Comma (,)
  - Semicolon (;)
  - Tab (\t)
  - Pipe (|)
  - Custom delimiter
- Header Row: Specify if the first row contains column names
  - Check “Has Header” if your file has column names
  - Uncheck if the first row contains data
- Encoding: Select the file encoding
  - UTF-8 (recommended for international characters)
  - ISO-8859-1 (Latin-1)
  - Windows-1252 (Windows default)
Advanced Settings:
- Quote Character: Character used to enclose text fields
- Escape Character: Character used to escape special characters
- Skip Empty Lines: Remove rows with no data
- Trim Whitespace: Remove leading/trailing spaces
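To make these settings concrete, here is a minimal sketch of a CSV parser that honors a delimiter, a quote character, a header flag, whitespace trimming, and empty-line skipping. It is an illustration only, not UnDatasIO's parser: the function name `parseCsv` and its option names are assumptions, and it treats a doubled quote character as the escape for a literal quote.

```javascript
// Minimal CSV parser sketch. Option names are illustrative assumptions.
function parseCsv(text, options = {}) {
  const {
    delimiter = ",",
    quote = '"',
    hasHeader = true,
    skipEmptyLines = true,
    trim = true,
  } = options;

  const rows = [];
  let row = [];
  let field = "";
  let inQuotes = false;
  let wasQuoted = false;

  const endField = () => {
    // Quoted fields keep their whitespace; unquoted fields are trimmed if requested
    row.push(trim && !wasQuoted ? field.trim() : field);
    field = "";
    wasQuoted = false;
  };
  const endRow = () => {
    endField();
    const isEmpty = row.length === 1 && row[0] === "";
    if (!(skipEmptyLines && isEmpty)) rows.push(row);
    row = [];
  };

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inQuotes) {
      if (ch === quote && text[i + 1] === quote) {
        field += quote; // doubled quote inside a quoted field = literal quote
        i++;
      } else if (ch === quote) {
        inQuotes = false;
      } else {
        field += ch;
      }
    } else if (ch === quote) {
      inQuotes = true;
      wasQuoted = true;
    } else if (ch === delimiter) {
      endField();
    } else if (ch === "\n") {
      endRow();
    } else if (ch !== "\r") {
      field += ch;
    }
  }
  if (field !== "" || row.length > 0) endRow(); // last line without trailing newline

  if (!hasHeader) return rows;
  const [header, ...records] = rows;
  return records.map((r) => Object.fromEntries(header.map((name, i) => [name, r[i] ?? ""])));
}

const sample = 'name;note\nAda;"x; y"\n\nBob; hi \n';
console.log(parseCsv(sample, { delimiter: ";" }));
```

With `hasHeader` set to `false` the function returns arrays of strings instead of objects, which mirrors what unchecking “Has Header” does to the first row.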
For JSON Files
Basic Settings:
- Root Path: Specify the path to your data array/object
  - Leave empty if data is at the root level
  - Use dot notation (e.g., “data.users”) for nested structures
- Schema Validation: Enable to validate against a JSON schema
  - Upload a JSON schema file
  - Or define schema inline
Advanced Settings:
- Pretty Print: Format JSON for better readability
- Max Depth: Limit the depth of nested objects
- Array Handling: Choose how to handle arrays
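The Root Path setting can be pictured as a walk through the parsed JSON, one key per dot-separated segment. The sketch below shows that idea; `resolveRootPath` is a hypothetical helper written for this guide, not a function of the product.

```javascript
// Resolve a dot-notation root path such as "data.users" against parsed JSON.
// Hypothetical helper for illustration only.
function resolveRootPath(json, rootPath = "") {
  if (rootPath === "") return json; // empty path: the data is at the root level
  let current = json;
  for (const key of rootPath.split(".")) {
    if (current === null || typeof current !== "object" || !(key in current)) {
      throw new Error(`Root path "${rootPath}" not found at segment "${key}"`);
    }
    current = current[key];
  }
  return current;
}

const doc = JSON.parse('{"data": {"users": [{"id": 1}, {"id": 2}]}}');
console.log(resolveRootPath(doc, "data.users").length); // 2
```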
For Excel Files
Basic Settings:
- Sheet Selection: Choose which worksheet to parse
  - Select from available sheets
  - Or specify sheet by name/number
- Range: Specify the cell range to parse
  - Use Excel notation (e.g., “A1:D100”)
  - Leave empty to parse entire sheet
- Header Row: Specify if the first row contains column names
Advanced Settings:
- Skip Rows: Number of rows to skip at the beginning
- Skip Columns: Number of columns to skip from the left
- Formula Evaluation: Whether to evaluate formulas
- Date Format: Specify date format for date columns
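Excel range notation such as “A1:D100” maps directly onto row and column indexes. The following sketch converts a range string into zero-based bounds; the helper names `columnToIndex` and `parseRange` are assumptions for this example, and the sketch does not handle sheet prefixes or absolute references like `$A$1`.

```javascript
// Convert column letters to a zero-based index: A -> 0, Z -> 25, AA -> 26.
function columnToIndex(letters) {
  let n = 0;
  for (const ch of letters.toUpperCase()) {
    n = n * 26 + (ch.charCodeAt(0) - 64); // "A" has char code 65
  }
  return n - 1;
}

// Parse a range like "A1:D100" into zero-based, inclusive bounds.
function parseRange(range) {
  const m = /^([A-Za-z]+)(\d+):([A-Za-z]+)(\d+)$/.exec(range.trim());
  if (!m) throw new Error(`Invalid range: ${range}`);
  return {
    startCol: columnToIndex(m[1]),
    startRow: Number(m[2]) - 1,
    endCol: columnToIndex(m[3]),
    endRow: Number(m[4]) - 1,
  };
}

console.log(parseRange("A1:D100"));
// { startCol: 0, startRow: 0, endCol: 3, endRow: 99 }
```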
For Text Files
Basic Settings:
- Delimiter: Character that separates values
- Line Separator: Character that separates lines
- Encoding: File encoding
Advanced Settings:
- Custom Parsing Rules: Define custom parsing patterns
- Regular Expressions: Use regex for complex parsing
- Multi-line Records: Handle records spanning multiple lines
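As an example of regex-based parsing with multi-line records, the sketch below extracts fields from log-style text using named capture groups and appends continuation lines to the previous record. The log layout, the pattern, and the `parseLogLines` helper are all hypothetical; they only illustrate the kind of rule these settings describe.

```javascript
// Hypothetical line layout: "2024-01-15 ERROR Disk full"
const linePattern = /^(?<date>\d{4}-\d{2}-\d{2})\s+(?<level>[A-Z]+)\s+(?<message>.*)$/;

function parseLogLines(text) {
  const records = [];
  for (const line of text.split(/\r?\n/)) {
    const match = linePattern.exec(line);
    if (match) {
      records.push({ ...match.groups }); // start of a new record
    } else if (records.length > 0 && line.trim() !== "") {
      // A line that does not match continues the previous record's message
      records[records.length - 1].message += "\n" + line;
    }
  }
  return records;
}

const log = "2024-01-15 ERROR Disk full\n  at /dev/sda1\n2024-01-16 INFO OK";
console.log(parseLogLines(log).length); // 2
```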
Step 5: Parse and Review
- Click “Parse File” to start the parsing process
- Wait for the parsing to complete
- Review the results in the preview panel
Understanding the Results
Data Preview:
- First 10 rows of parsed data
- Column names and data types
- Data quality indicators
Parsing Statistics:
- Total rows processed
- Total columns detected
- Parsing errors (if any)
- Processing time
Data Quality Report:
- Missing values count
- Data type consistency
- Format validation results
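To show what figures such as “missing values count” and “data type consistency” can mean in practice, the sketch below computes them per column over parsed records. The `qualityReport` helper and its very simple type inference are assumptions for illustration; the report in the interface may be computed differently.

```javascript
// Classify a single cell value. Deliberately simple: missing, number, or string.
function inferType(value) {
  if (value === null || value === undefined || String(value).trim() === "") return "missing";
  return Number.isNaN(Number(value)) ? "string" : "number";
}

// Count missing values and observed types for every column.
function qualityReport(records) {
  const report = {};
  for (const record of records) {
    for (const [column, value] of Object.entries(record)) {
      if (!report[column]) report[column] = { missing: 0, types: {} };
      const stats = report[column];
      const type = inferType(value);
      if (type === "missing") stats.missing++;
      else stats.types[type] = (stats.types[type] ?? 0) + 1;
    }
  }
  return report;
}

const records = [
  { age: "31", city: "Oslo" },
  { age: "", city: "Rome" },
  { age: "n/a", city: "Lima" },
];
console.log(qualityReport(records));
```

A column whose `types` object has more than one key, like `age` in this sample, is a candidate for the data type consistency warning.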
Step 6: Handle Parsing Issues
Common Issues and Solutions
Encoding Problems:
- Symptoms: Garbled characters, missing text
- Solution: Try different encoding options (UTF-8, ISO-8859-1, Windows-1252)
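If you want to check a file's encoding before uploading, one approach is to attempt strict UTF-8 decoding and fall back to a single-byte encoding when it fails. The sketch below uses the standard `TextDecoder` API available in browsers and Node.js; the helper name `decodeWithFallback` is an assumption, and a successful fallback does not prove that the fallback encoding is the right one.

```javascript
// Try strict UTF-8 first, then fall back to a single-byte encoding.
function decodeWithFallback(bytes, fallback = "windows-1252") {
  try {
    // fatal: true makes the decoder throw on invalid UTF-8
    // instead of silently inserting replacement characters
    const text = new TextDecoder("utf-8", { fatal: true }).decode(bytes);
    return { text, encoding: "utf-8" };
  } catch {
    const text = new TextDecoder(fallback).decode(bytes);
    return { text, encoding: fallback };
  }
}

// "café" saved as Windows-1252: the final byte 0xE9 is not valid UTF-8
const legacyBytes = new Uint8Array([0x63, 0x61, 0x66, 0xe9]);
console.log(decodeWithFallback(legacyBytes).encoding); // "windows-1252"
```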
Delimiter Issues:
- Symptoms: All data in one column, incorrect column separation
- Solution: Check the actual delimiter in your file and select the correct option
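One way to check the actual delimiter is to count each candidate on the first few lines and prefer the one that appears the same number of times on every line. The sketch below does that; `detectDelimiter` is a hypothetical helper, and it ignores quoting, so a delimiter inside a quoted field can mislead it.

```javascript
// Pick the candidate that occurs consistently (and most often) per line.
// Heuristic only: quoted fields are not taken into account.
function detectDelimiter(text, candidates = [",", ";", "\t", "|"]) {
  const lines = text.split(/\r?\n/).filter((line) => line !== "").slice(0, 10);
  let best = { delimiter: candidates[0], score: 0 };
  for (const delimiter of candidates) {
    const counts = lines.map((line) => line.split(delimiter).length - 1);
    const consistent = counts.every((c) => c === counts[0]);
    const score = consistent ? counts[0] : 0;
    if (score > best.score) best = { delimiter, score };
  }
  return best.delimiter;
}

console.log(detectDelimiter("id;name;city\n1;Ada;Oslo\n2;Bob;Rome")); // ";"
```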
Header Problems:
- Symptoms: Column names appear as data, or data appears as column names
- Solution: Toggle the “Has Header” option
Large File Issues:
- Symptoms: Slow processing, timeout errors
- Solution: Use streaming processing or split large files
Error Messages
“Invalid file format”
- Check if the file extension matches the actual content
- Try manual format selection
- Verify the file isn’t corrupted
“Parsing failed”
- Review the error details
- Check file encoding and format
- Verify parsing configuration
“File too large”
- Use file compression
- Split the file into smaller chunks
- Contact support for large file processing
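Splitting a large CSV into smaller chunks can be done locally before uploading. The sketch below repeats the header row in every chunk so that each piece parses on its own. The helper name `splitCsvByRows` is an assumption, and the approach is line-based, so it is not suitable for files whose quoted fields contain line breaks.

```javascript
// Split CSV text into chunks of at most rowsPerChunk data rows,
// repeating the header line at the top of every chunk.
function splitCsvByRows(text, rowsPerChunk) {
  if (!Number.isInteger(rowsPerChunk) || rowsPerChunk < 1) {
    throw new Error("rowsPerChunk must be a positive integer");
  }
  const lines = text.split(/\r?\n/).filter((line) => line !== "");
  const [header, ...rows] = lines;
  const chunks = [];
  for (let i = 0; i < rows.length; i += rowsPerChunk) {
    chunks.push([header, ...rows.slice(i, i + rowsPerChunk)].join("\n") + "\n");
  }
  return chunks;
}

console.log(splitCsvByRows("id\n1\n2\n3\n", 2));
// [ 'id\n1\n2\n', 'id\n3\n' ]
```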
Step 7: Export or Process Further
Once parsing is successful, you can:
Export Options
- Download as CSV: Standard comma-separated format
- Download as JSON: Structured data format
- Download as Excel: Spreadsheet format
- Download as XML: Markup language format
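For reference, a CSV export boils down to writing a header line followed by one line per record, quoting any value that contains the delimiter, a quote, or a line break. The sketch below follows the common RFC 4180 conventions; `toCsv` is a hypothetical helper and says nothing about how UnDatasIO generates its downloads.

```javascript
// Serialize an array of flat objects to CSV text.
// Columns are taken from the keys of the first record.
function toCsv(records, delimiter = ",") {
  if (records.length === 0) return "";
  const columns = Object.keys(records[0]);
  const escape = (value) => {
    const s = value === null || value === undefined ? "" : String(value);
    const needsQuotes = /["\r\n]/.test(s) || s.includes(delimiter);
    // Inside a quoted field, a literal quote is written as two quotes
    return needsQuotes ? `"${s.replace(/"/g, '""')}"` : s;
  };
  const lines = [columns.map(escape).join(delimiter)];
  for (const record of records) {
    lines.push(columns.map((c) => escape(record[c])).join(delimiter));
  }
  return lines.join("\r\n") + "\r\n";
}

console.log(toCsv([{ name: 'Ada "A"', note: "x, y" }, { name: "Bob", note: null }]));
```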
Further Processing
- Data Cleaning: Remove duplicates, handle missing values
- Data Transformation: Convert data types, format values
- Data Validation: Apply business rules and constraints
- Data Analysis: Generate statistics and insights
Advanced Features
Batch Processing
Process multiple files simultaneously:
- Upload multiple files at once
- Select all files in the batch
- Apply the same parsing configuration to all files
- Process all files together
- Download results as a single file or individual files
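When automating a batch from your own code, the same idea can be sketched as applying one configuration to every file and collecting per-file outcomes, so that a single failure does not abort the whole batch. In the sketch below, `parseBatch` and the caller-supplied `parseOne` function are hypothetical; `parseOne` stands in for whatever actually parses a file, for example a wrapper around an API call.

```javascript
// Apply one parsing configuration to many files and report each outcome.
// parseOne(file, config) is supplied by the caller and must return a promise.
async function parseBatch(files, config, parseOne) {
  const settled = await Promise.allSettled(files.map((file) => parseOne(file, config)));
  return settled.map((outcome, i) => ({
    file: files[i],
    ok: outcome.status === "fulfilled",
    result: outcome.status === "fulfilled" ? outcome.value : undefined,
    error: outcome.status === "rejected" ? String(outcome.reason) : undefined,
  }));
}

// Usage with a stand-in parser that fails for one file
const fakeParse = async (file, config) => {
  if (file === "broken.csv") throw new Error("Parsing failed");
  return `${file} parsed as ${config.format}`;
};
parseBatch(["a.csv", "broken.csv"], { format: "csv", hasHeader: true }, fakeParse)
  .then((results) => console.log(results));
```

`Promise.allSettled` is used instead of `Promise.all` so that the results for the files that parsed correctly are still available when another file is rejected.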
Template Management
Save and reuse parsing configurations:
- Configure parsing options for your file type
- Click “Save as Template”
- Name your template (e.g., “Customer Data CSV”)
- Use the template for future files of the same type
API Integration
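Use the API for automated file processing:

```javascript
// Upload file via API
// Build the multipart body. `file` is a File or Blob you already hold;
// the form field name "file" is an assumption, so check the API reference.
const formData = new FormData();
formData.append("file", file);

const response = await fetch("https://api.undatasio.com/v1/files/upload", {
  method: "POST",
  headers: {
    Authorization: "Bearer YOUR_API_KEY",
  },
  body: formData,
});
```

```javascript
// Parse file via API
// `fileId` identifies the uploaded file. Reading it from an `id` field of
// the upload response is an assumption, so check the API reference.
const { id: fileId } = await response.json();

const parseResponse = await fetch(
  `https://api.undatasio.com/v1/files/${fileId}/parse`,
  {
    method: "POST",
    headers: {
      Authorization: "Bearer YOUR_API_KEY",
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      format: "csv",
      delimiter: "comma",
      hasHeader: true,
    }),
  }
);
```

Real-time Processing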
Set up webhooks for automated processing:
- Configure webhook endpoints
- Upload files via API or interface
- Receive notifications when parsing is complete
- Automatically process results in your application
Best Practices
File Preparation
- Clean your data before uploading
  - Remove unnecessary formatting
  - Ensure consistent data types
  - Handle missing values appropriately
- Use appropriate file formats
  - CSV for tabular data
  - JSON for structured data
  - Excel for complex spreadsheets
- Check file encoding
  - Use UTF-8 for international characters
  - Ensure compatibility with your data source
Performance Optimization
- Optimize file size
  - Compress large files
  - Remove unnecessary columns
  - Use efficient data formats
- Use batch processing
  - Process multiple files together
  - Reuse parsing configurations
  - Automate repetitive tasks
- Monitor processing
  - Check processing times
  - Review error logs
  - Optimize based on performance data
Data Quality
- Validate your data
  - Check for missing values
  - Verify data types
  - Ensure data consistency
- Handle errors gracefully
  - Review parsing errors
  - Fix data issues
  - Re-parse as needed
- Document your process
  - Save parsing configurations
  - Note any data transformations
  - Keep audit trails
Troubleshooting
Common Problems
File won’t upload:
- Check file size limits
- Verify file format is supported
- Ensure stable internet connection
Parsing errors:
- Review file content for issues
- Check parsing configuration
- Try different encoding options
Slow processing:
- Optimize file size
- Use appropriate file format
- Consider batch processing
Getting Help
If you encounter issues:
- Check the documentation for your specific file format
- Review error messages for specific guidance
- Contact support with detailed information about your issue
- Join the community for help from other users
Next Steps
After successfully uploading and parsing your files:
- Explore data processing features to clean and transform your data
- Learn about API integration to automate your workflows
- Set up data pipelines for recurring processing tasks
- Implement data validation to ensure data quality
For more advanced topics, check out: