Dataset Get Status

Command Signature

php artisan dataset:get-status {dataset_id?}

Purpose

The dataset:get-status command monitors in-progress dataset generation by querying the TV Python API for status updates, updating dataset records in the Console database (gb_console), and notifying users and administrators. Because it polls on a five-minute schedule, status tracking is near real-time rather than instantaneous. When a dataset completes, the command also runs the completion workflow, including batch job creation for specific viewpoint analysis.

Sequence Diagram

Step 1: Command Initialization and Dataset Retrieval

sequenceDiagram
    participant System
    participant StatusCommand as dataset:get-status
    participant HistoryRepo as WishlistDatasetHistoryRepository
    participant ConsoleDB as gb_console.wishlist_dataset_histories
    participant Logger
    participant Slack
    
    Note over System,Slack: Command Initialization (Every 5 Minutes)
    
    rect rgb(200, 255, 200)
    Note right of System: Happy Case - Command Startup
    System->>StatusCommand: Execute Command
    StatusCommand->>Logger: Log command start with dataset_id parameter
    
    StatusCommand->>HistoryRepo: getPendingAndAnalyzingRecords(dataset_id?)
    HistoryRepo->>ConsoleDB: Query WHERE status IN ('Pending', 'Analyzing')
    Note right of HistoryRepo: Filter by dataset_id if provided
    ConsoleDB-->>HistoryRepo: Return datasets with Pending/Analyzing status
    HistoryRepo-->>StatusCommand: Return datasets with Pending/Analyzing status
    StatusCommand->>Logger: Log found datasets count
    end
    
    rect rgb(255, 200, 200)
    Note right of System: Error Handling
    rect rgb(255, 230, 230)
    alt Database Connection Error
        HistoryRepo->>Logger: Log database connection error
        HistoryRepo->>Slack: Send database error notification
    else No Pending Datasets
        StatusCommand->>Logger: Log no datasets to monitor
    else Invalid Dataset ID
        StatusCommand->>Logger: Log invalid dataset_id parameter
        StatusCommand->>Slack: Send invalid parameter notification
    end
    end
    end
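
The retrieval step above can be sketched as a simple filter. This is a language-neutral illustration in Python, not the command's PHP implementation: the DatasetHistory shape, the integer IDs, and the in-memory list standing in for the database query are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional

# Statuses the diagram names in the WHERE status IN (...) clause.
MONITORED_STATUSES = ("Pending", "Analyzing")


@dataclass
class DatasetHistory:
    # Hypothetical, simplified row shape for illustration only.
    id: int
    dataset_id: int
    status: str


def get_pending_and_analyzing_records(
    rows: list[DatasetHistory], dataset_id: Optional[int] = None
) -> list[DatasetHistory]:
    """Return rows still being processed, optionally narrowed to one dataset."""
    selected = [r for r in rows if r.status in MONITORED_STATUSES]
    if dataset_id is not None:
        selected = [r for r in selected if r.dataset_id == dataset_id]
    return selected


rows = [
    DatasetHistory(1, 101, "Pending"),
    DatasetHistory(2, 102, "Analyzing"),
    DatasetHistory(3, 103, "Complete"),
]
all_open = get_pending_and_analyzing_records(rows)
only_102 = get_pending_and_analyzing_records(rows, dataset_id=102)
```

An empty result corresponds to the "No Pending Datasets" branch: there is nothing to monitor, and the run can end after logging that fact.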

Step 2: API Status Checking

sequenceDiagram
    participant StatusCommand as dataset:get-status
    participant AnalyzerAPI as TV Python API
    participant APIService as AnalyzerApiService
    participant Logger
    participant Slack
    
    Note over StatusCommand,Slack: Dataset Status Verification Process
    
    rect rgb(200, 255, 200)
    Note right of StatusCommand: Happy Case - Status Monitoring
    
    loop For each Pending or Analyzing Dataset
        StatusCommand->>Logger: Log dataset processing start
        
        rect rgb(200, 230, 255)
        Note right of StatusCommand: API Status Request
        StatusCommand->>APIService: prepareStatusRequest(datasetId)
        APIService->>APIService: validateDatasetId()
        APIService->>APIService: addAuthenticationHeaders()
        APIService-->>StatusCommand: Return formatted request
        
        StatusCommand->>AnalyzerAPI: getDatasetDetail(datasetId)
        AnalyzerAPI->>AnalyzerAPI: Validate request and retrieve dataset
        AnalyzerAPI-->>StatusCommand: Return current status and progress
        StatusCommand->>Logger: Log API response with status details
        end
    end
    end
    
    rect rgb(255, 200, 200)
    Note right of StatusCommand: Error Handling
    rect rgb(255, 230, 230)
    alt API Authentication Error
        AnalyzerAPI->>Logger: Log authentication failure
        AnalyzerAPI->>Slack: Send API authentication error
        AnalyzerAPI->>StatusCommand: Return authentication error
    else Dataset Not Found
        AnalyzerAPI->>Logger: Log dataset not found error
        AnalyzerAPI->>Slack: Send dataset not found notification
        AnalyzerAPI->>StatusCommand: Return not found error
    else API Service Error
        AnalyzerAPI->>Logger: Log API service error with response
        AnalyzerAPI->>Slack: Send API service error notification
        AnalyzerAPI->>StatusCommand: Return service error
    else API Timeout Error
        AnalyzerAPI->>Logger: Log API timeout error
        AnalyzerAPI->>Slack: Send timeout error notification
        AnalyzerAPI->>StatusCommand: Return timeout error
    end
    end
    end
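
One way to picture the four error branches above is as a classification of the API response. The sketch below is in Python for illustration only; the mapping from HTTP status codes to branches is an assumption, since this document does not specify how the TV Python API signals each failure.

```python
from enum import Enum
from typing import Optional


class ApiFailure(Enum):
    # Names mirror the alt branches in the diagram.
    AUTHENTICATION = "API Authentication Error"
    NOT_FOUND = "Dataset Not Found"
    TIMEOUT = "API Timeout Error"
    SERVICE = "API Service Error"


def classify_response(
    status_code: Optional[int], timed_out: bool = False
) -> Optional[ApiFailure]:
    """Map a response to one error branch, or None when the call succeeded."""
    if timed_out or status_code is None:
        return ApiFailure.TIMEOUT
    if status_code in (401, 403):  # assumed codes for credential problems
        return ApiFailure.AUTHENTICATION
    if status_code == 404:  # assumed code for an unknown dataset
        return ApiFailure.NOT_FOUND
    if status_code >= 400:  # any other client or server error
        return ApiFailure.SERVICE
    return None
```

Classifying first and reacting second keeps the logging and Slack handling in one place per branch.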

Step 3: Status Comparison and Change Detection

sequenceDiagram
    participant StatusCommand as dataset:get-status
    participant HistoryRepo as WishlistDatasetHistoryRepository
    participant LogService as DatasetCreationLogService
    participant ConsoleDB as gb_console
    participant Logger
    participant Slack
    
    Note over StatusCommand,Slack: Status Change Detection and Validation
    
    rect rgb(200, 255, 200)
    Note right of StatusCommand: Happy Case - Status Comparison
    
    rect rgb(200, 230, 255)
    Note right of StatusCommand: Status Change Detection
    StatusCommand->>StatusCommand: compareCurrentWithApiStatus()
    StatusCommand->>StatusCommand: validateStatusTransition()
    StatusCommand->>Logger: Log status comparison result
    
    alt Status Changed
        Note right of StatusCommand: Status Update Required
        StatusCommand->>HistoryRepo: transactionBegin()
        StatusCommand->>HistoryRepo: updateDatasetStatus(newStatus)
        HistoryRepo->>ConsoleDB: UPDATE wishlist_dataset_histories SET status
        ConsoleDB-->>HistoryRepo: Confirm status updated
        HistoryRepo-->>StatusCommand: Confirm status updated
        
        StatusCommand->>LogService: logStatusChangeEvent(dataset, oldStatus, newStatus)
        LogService->>ConsoleDB: INSERT INTO wishlist_dataset_creation_logs
        ConsoleDB-->>LogService: Return log record ID
        LogService-->>StatusCommand: Confirm event logged
        
        StatusCommand->>Logger: Log status change success
    else No Status Change
        Note right of StatusCommand: Status Unchanged
        StatusCommand->>StatusCommand: Continue monitoring
        StatusCommand->>Logger: Log no status change detected
    end
    end
    end
    
    rect rgb(255, 200, 200)
    Note right of StatusCommand: Error Handling
    rect rgb(255, 230, 230)
    alt Invalid Status Transition
        StatusCommand->>Logger: Log invalid status transition error
        StatusCommand->>Slack: Send status validation error notification
    else Database Update Error
        StatusCommand->>Logger: Log database update error
        StatusCommand->>HistoryRepo: transactionRollback()
        StatusCommand->>Slack: Send database error notification
    else Log Creation Error
        StatusCommand->>Logger: Log event logging error
        StatusCommand->>Slack: Send logging error notification
    end
    end
    end
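
The comparison and validation steps above can be sketched as a lookup in a transition table. This is a Python illustration, not the command's code. The table is an assumption built on the documented Pending → Analyzing → Complete/Failed flow; in particular, allowing Pending to jump straight to Complete or Failed (for a dataset that finishes between two polls) is a guess.

```python
# Assumed transition table; Complete and Failed are treated as terminal.
ALLOWED_TRANSITIONS = {
    "Pending": {"Analyzing", "Complete", "Failed"},
    "Analyzing": {"Complete", "Failed"},
    "Complete": set(),
    "Failed": set(),
}


def compare_status(current: str, reported: str) -> bool:
    """Return True if an update is needed, False if the status is unchanged.

    Raises ValueError when the API reports a move the table does not allow,
    which corresponds to the "Invalid Status Transition" branch.
    """
    if reported == current:
        return False
    if reported not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"Invalid status transition: {current} -> {reported}")
    return True
```

For example, compare_status("Pending", "Analyzing") returns True, while compare_status("Complete", "Pending") raises ValueError.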

Step 4: Dataset Completion Handling

sequenceDiagram
    participant StatusCommand as dataset:get-status
    participant BatchJobService as BatchJobViewpointService
    participant NotificationService as DatasetNotificationService
    participant AnalyzerDB as gb_analyzer.batch_jobs_vps
    participant ConsoleDB as gb_console
    participant Logger
    participant Slack
    
    Note over StatusCommand,Slack: Dataset Completion Processing
    
    rect rgb(200, 255, 200)
    Note right of StatusCommand: Happy Case - Dataset Complete
    
    rect rgb(200, 230, 255)
    Note right of StatusCommand: Batch Job Creation
    StatusCommand->>BatchJobService: createBatchJobsIfNeeded(dataset)
    BatchJobService->>BatchJobService: checkViewpointRequirements()
    BatchJobService->>BatchJobService: prepareBatchJobData()
    BatchJobService->>AnalyzerDB: INSERT INTO batch_jobs_vps
    AnalyzerDB-->>BatchJobService: Return batch job IDs
    BatchJobService-->>StatusCommand: Confirm batch jobs created
    StatusCommand->>Logger: Log batch job creation success
    end
    
    rect rgb(200, 230, 255)
    Note right of StatusCommand: Success Notification
    StatusCommand->>NotificationService: sendSuccessNotification()
    NotificationService->>ConsoleDB: INSERT INTO notifications
    ConsoleDB-->>NotificationService: Return notification ID
    NotificationService->>NotificationService: prepareEmailNotification()
    NotificationService-->>StatusCommand: Confirm notification sent
    StatusCommand->>Logger: Log success notification sent
    end
    
    rect rgb(230, 200, 255)
    Note right of StatusCommand: Success Monitoring
    StatusCommand->>Logger: Log dataset completion success with statistics
    StatusCommand->>Slack: Send dataset completion notification
    end
    end
    
    rect rgb(255, 200, 200)
    Note right of StatusCommand: Error Handling
    rect rgb(255, 230, 230)
    alt Batch Job Creation Error
        BatchJobService->>Logger: Log batch job creation error
        BatchJobService->>Slack: Send batch job error notification
        BatchJobService->>StatusCommand: Return batch job error
    else Notification Service Error
        NotificationService->>Logger: Log notification service error
        NotificationService->>Slack: Send notification error alert
        NotificationService->>StatusCommand: Return notification error
    end
    end
    end

Step 5: Dataset Failure Handling

sequenceDiagram
    participant StatusCommand as dataset:get-status
    participant HistoryRepo as WishlistDatasetHistoryRepository
    participant NotificationService as DatasetNotificationService
    participant ConsoleDB as gb_console
    participant Logger
    participant Slack
    
    Note over StatusCommand,Slack: Dataset Failure Processing
    
    rect rgb(255, 255, 200)
    Note right of StatusCommand: Dataset Failed Status
    
    rect rgb(200, 230, 255)
    Note right of StatusCommand: Error Information Update
    StatusCommand->>HistoryRepo: updateErrorMessage(failureReason)
    HistoryRepo->>ConsoleDB: UPDATE wishlist_dataset_histories SET error_message
    ConsoleDB-->>HistoryRepo: Confirm error updated
    HistoryRepo-->>StatusCommand: Confirm error updated
    StatusCommand->>Logger: Log error message update
    end
    
    rect rgb(200, 230, 255)
    Note right of StatusCommand: Failure Notification
    StatusCommand->>NotificationService: sendFailureNotification()
    NotificationService->>ConsoleDB: INSERT INTO notifications (failure type)
    ConsoleDB-->>NotificationService: Return notification ID
    NotificationService->>NotificationService: prepareFailureEmail()
    NotificationService-->>StatusCommand: Confirm notification sent
    StatusCommand->>Logger: Log failure notification sent
    end
    
    rect rgb(255, 200, 200)
    Note right of StatusCommand: Administrative Alert
    StatusCommand->>Slack: Send dataset failure alert with details
    StatusCommand->>Logger: Log administrative failure notification
    end
    end

Step 6: Real-time Broadcasting

sequenceDiagram
    participant StatusCommand as dataset:get-status
    participant PusherService as Pusher Service
    participant WebSocketClients as Connected Clients
    participant Logger
    participant Slack
    
    Note over StatusCommand,Slack: Real-time Status Broadcasting
    
    rect rgb(200, 255, 200)
    Note right of StatusCommand: Happy Case - Real-time Updates
    
    rect rgb(200, 230, 255)
    Note right of StatusCommand: Completion Broadcasting
    StatusCommand->>PusherService: broadcastCompletion()
    PusherService->>PusherService: prepareCompletionEvent()
    PusherService->>WebSocketClients: Send completion event to private channels
    WebSocketClients-->>PusherService: Acknowledge receipt
    PusherService-->>StatusCommand: Confirm broadcast sent
    StatusCommand->>Logger: Log completion broadcast success
    end
    
    rect rgb(200, 230, 255)
    Note right of StatusCommand: Failure Broadcasting
    StatusCommand->>PusherService: broadcastFailure()
    PusherService->>PusherService: prepareFailureEvent()
    PusherService->>WebSocketClients: Send failure event to private channels
    WebSocketClients-->>PusherService: Acknowledge receipt
    PusherService-->>StatusCommand: Confirm broadcast sent
    StatusCommand->>Logger: Log failure broadcast success
    end
    end
    
    rect rgb(255, 200, 200)
    Note right of StatusCommand: Error Handling
    rect rgb(255, 230, 230)
    alt Pusher Service Error
        PusherService->>Logger: Log Pusher service error
        PusherService->>StatusCommand: Fallback to database-only notification
        PusherService->>Slack: Send Pusher error notification
    else WebSocket Connection Error
        PusherService->>Logger: Log WebSocket connection error
        PusherService->>StatusCommand: Continue with partial delivery
    end
    end
    end
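
The "Pusher Service Error" branch above treats broadcasting as best effort: if the broadcast fails, the stored notification remains the only delivery path. A minimal Python sketch of that pattern follows; BroadcastError, the callables, and the return values are hypothetical names, not the Pusher client's API.

```python
class BroadcastError(Exception):
    """Stands in for any failure raised by the broadcasting client."""


def broadcast_with_fallback(event, broadcast, log):
    """Try to broadcast; on failure, log and report database-only delivery."""
    try:
        broadcast(event)
    except BroadcastError as exc:
        log(f"Pusher service error: {exc}")
        return "database-only"
    return "broadcast"


def failing_broadcast(event):
    # Simulates an unreachable broadcasting service.
    raise BroadcastError("connection refused")


messages = []
event = {"dataset_id": 101, "status": "Complete"}
delivered = broadcast_with_fallback(event, lambda e: None, messages.append)
degraded = broadcast_with_fallback(event, failing_broadcast, messages.append)
```

The point of the design is that a broadcasting outage degrades delivery without failing the whole status update.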

Step 7: Transaction Management and Command Completion

sequenceDiagram
    participant StatusCommand as dataset:get-status
    participant HistoryRepo as WishlistDatasetHistoryRepository
    participant System
    participant Logger
    participant Slack
    
    Note over StatusCommand,Slack: Transaction Management and Command Completion
    
    rect rgb(200, 255, 200)
    Note right of StatusCommand: Happy Case - Successful Completion
    
    rect rgb(200, 230, 255)
    Note right of StatusCommand: Transaction Commit
    StatusCommand->>HistoryRepo: transactionCommit()
    HistoryRepo->>HistoryRepo: Commit all pending changes
    HistoryRepo-->>StatusCommand: Confirm transaction committed
    StatusCommand->>Logger: Log transaction commit success
    end
    
    rect rgb(230, 200, 255)
    Note right of StatusCommand: Command Summary
    StatusCommand->>Logger: Log command completion with overall statistics
    StatusCommand->>Logger: Log total datasets processed and status changes
    StatusCommand->>Logger: Log execution time and performance metrics
    StatusCommand->>Slack: Send monitoring summary with metrics
    end
    end
    
    rect rgb(255, 200, 200)
    Note right of StatusCommand: Error Handling and Recovery
    rect rgb(255, 230, 230)
    alt Transaction Rollback Required
        StatusCommand->>HistoryRepo: transactionRollback()
        HistoryRepo->>HistoryRepo: Rollback all pending changes
        HistoryRepo-->>StatusCommand: Confirm transaction rolled back
        StatusCommand->>Logger: Log transaction rollback with reason
        StatusCommand->>Slack: Send transaction rollback alert
    else API Error Recovery
        StatusCommand->>Logger: Log API error details with retry information
        StatusCommand->>Slack: Send API error alert with resolution steps
    else Database Error Recovery
        StatusCommand->>Logger: Log database error details
        StatusCommand->>Slack: Send database error notification
    else Critical System Error
        StatusCommand->>Logger: Log critical system error
        StatusCommand->>Slack: Send critical error alert
        StatusCommand->>System: Halt further processing
    end
    end
    end
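
The commit and rollback paths above can be demonstrated with an in-memory SQLite database. This is a Python sketch under simplified assumptions: the two tables carry only hypothetical columns, and the real command uses the repository's transactionBegin, transactionCommit, and transactionRollback methods against gb_console.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-ins for the real tables; the columns are assumptions.
conn.execute(
    "CREATE TABLE wishlist_dataset_histories (id INTEGER PRIMARY KEY, status TEXT)"
)
conn.execute(
    "CREATE TABLE wishlist_dataset_creation_logs ("
    "id INTEGER PRIMARY KEY, history_id INTEGER NOT NULL, message TEXT NOT NULL)"
)
conn.execute("INSERT INTO wishlist_dataset_histories (id, status) VALUES (1, 'Analyzing')")
conn.commit()


def update_status(conn, history_id, new_status, message):
    """Apply the status update and its log entry atomically."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE wishlist_dataset_histories SET status = ? WHERE id = ?",
                (new_status, history_id),
            )
            conn.execute(
                "INSERT INTO wishlist_dataset_creation_logs (history_id, message) "
                "VALUES (?, ?)",
                (history_id, message),
            )
        return True
    except sqlite3.Error:
        return False


ok = update_status(conn, 1, "Complete", "Status changed to Complete")
# A NULL message violates NOT NULL, so the whole transaction is rolled back.
failed = update_status(conn, 1, "Failed", None)
status = conn.execute(
    "SELECT status FROM wishlist_dataset_histories WHERE id = 1"
).fetchone()[0]
```

After the second call the row still reads Complete, because the failed log insert also undid the status update made in the same transaction.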

Detail

Parameters

  • {dataset_id?}: Optional parameter to check status for a specific dataset
    • When provided, only the specified dataset's status will be checked
    • When omitted, all pending and analyzing datasets will be processed
    • Must be a valid dataset ID from the wishlist_dataset_histories table
    • Useful for debugging specific dataset issues or manual status checks

Frequency

  • Scheduled Execution: Every 5 minutes
    • Configured in routes/console.php using Laravel's scheduler
    • Example: Schedule::command('dataset:get-status')->everyFiveMinutes();
  • Manual Execution: Can be triggered manually for specific datasets
    • Used for immediate status checking or troubleshooting

Dependencies

Database Dependencies:

  • Console database (gb_console) connection for dataset history records
  • Transaction support for atomic status updates
  • Proper foreign key relationships between dataset histories and wishlists

External Service Dependencies:

  • TV Python API service availability for status queries
  • Valid API credentials configured in environment variables
  • Network connectivity to the API service
  • API endpoint compatibility for dataset detail retrieval

System Dependencies:

  • Existing dataset records in wishlist_dataset_histories with valid dataset IDs
  • A mapping from every API status value to the internal status enum
  • Notification service configuration for user alerts
  • Pusher service configuration for real-time updates
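
The status mapping listed under System Dependencies can be sketched as follows. The internal values come from the statuses used elsewhere in this document; the API-side strings ("queued", "processing", "done", "error") are purely hypothetical, because the API's vocabulary is not given here. Python is used for illustration only.

```python
from enum import Enum


class DatasetStatus(Enum):
    PENDING = "Pending"
    ANALYZING = "Analyzing"
    COMPLETE = "Complete"
    FAILED = "Failed"


# Hypothetical API values; replace with the real API vocabulary.
API_STATUS_MAP = {
    "queued": DatasetStatus.PENDING,
    "processing": DatasetStatus.ANALYZING,
    "done": DatasetStatus.COMPLETE,
    "error": DatasetStatus.FAILED,
}


def map_api_status(raw: str) -> DatasetStatus:
    """Translate an API status string, failing loudly on unknown values."""
    try:
        return API_STATUS_MAP[raw.strip().lower()]
    except KeyError:
        raise ValueError(f"Unmapped API status: {raw!r}") from None
```

Raising on an unmapped value, rather than defaulting to Pending, surfaces API changes as status mapping errors instead of leaving datasets silently stuck.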

Output

Tables

The dataset status monitoring command interacts with multiple database tables. For complete database schema and table relationships, see the Database Schema section.

Primary Output Tables:

  • wishlist_dataset_histories: Updates dataset status and error messages
  • wishlist_dataset_creation_logs: Logs all status change events
  • wishlist_to_groups: Updates manual request flags for completed datasets

Command-Specific Operations:

  • Updates: Dataset status transitions (Pending → Analyzing → Complete/Failed)
  • Logs: Status change events in wishlist_dataset_creation_logs with timestamps
  • Creates: Batch job records in gb_analyzer.batch_jobs_vps for completed datasets requiring viewpoint analysis

Services

TV Python API:

  • Dataset detail endpoint for status retrieval
  • Returns current processing status and progress information
  • Provides error details for failed dataset processing

Notification Services:

  • In-app notifications via DatasetNotification class for completion/failure
  • Email notifications for dataset processing updates
  • Slack alerts for administrators via DatasetSlackChannel
  • Real-time updates via Pusher for connected clients and dashboards

Repository Services:

  • WishlistDatasetHistoryRepositoryInterface: Status updates and transaction management
  • DatasetCreationLogService: Event logging with status change details
  • BatchJobViewpointService: Batch job creation for completed datasets
  • DatasetNotificationService: Multi-channel notification orchestration

Background Job Services:

  • CancelCrawlerScheduleOnceJob: Cleanup for manual training schedules
  • Batch job creation for specific viewpoint processing
  • Queue management for asynchronous notification delivery

Error Handling

Log

The command logs the following categories of errors to support troubleshooting of status monitoring:

API Communication Errors:

  • Failed API requests with full error response details and status codes
  • Timeout errors with retry attempt information and API endpoint details
  • Authentication failures with credential validation suggestions
  • Rate limiting errors with backoff strategy recommendations

Status Processing Errors:

  • Invalid status values with API response validation details
  • Status mapping errors between API and internal enum values
  • Database transaction failures with rollback information and affected records
  • Notification delivery failures with service-specific error codes

Log Locations:

  • Application logs: storage/logs/laravel.log with contextual dataset information
  • Command-specific logs with execution statistics and API response times
  • Error logs with full stack traces and API request/response details for debugging

Slack

Automated Slack notifications are sent via DatasetSlackChannel for operational monitoring:

Success Notifications:

  • Status monitoring completion with processing statistics and timing
  • Dataset completion notifications with wishlist details and processing duration
  • Batch processing summaries with success/failure counts and performance metrics

Error Notifications:

  • API communication failures with error codes and suggested resolution steps
  • Database operation failures with affected dataset details and rollback information
  • Status inconsistency alerts with data validation recommendations
  • Critical system errors requiring immediate administrative attention

Notification Format:

  • Command name and execution timestamp for operational tracking
  • Error type and severity level for incident prioritization
  • Affected dataset IDs and wishlist groups for investigation
  • Suggested troubleshooting steps and documentation references
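
The four elements of the notification format can be assembled as shown below. This is a Python sketch: the field labels, the plain-text layout, and the build_alert helper are assumptions for illustration, not the actual DatasetSlackChannel message format.

```python
from datetime import datetime, timezone


def build_alert(command, error_type, severity, dataset_ids, wishlist_groups,
                steps, now=None):
    """Render an alert containing the fields listed under Notification Format."""
    now = now or datetime.now(timezone.utc)
    lines = [
        f"Command: {command}",
        f"Time: {now.isoformat()}",
        f"Error: {error_type} (severity: {severity})",
        "Datasets: " + ", ".join(str(d) for d in dataset_ids),
        "Wishlist groups: " + ", ".join(wishlist_groups),
        "Next steps:",
    ]
    lines += [f"  - {step}" for step in steps]
    return "\n".join(lines)


msg = build_alert(
    "dataset:get-status",
    "API Timeout Error",
    "high",
    [101, 102],
    ["Wishlist A"],
    ["Check API availability"],
    now=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
```

Passing the timestamp in explicitly keeps the output deterministic, which makes the formatter easy to test.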

Troubleshooting

Check Data

Verify Pending Datasets:

-- Check datasets awaiting status updates
SELECT wdh.id, wdh.dataset_id, wdh.status, wdh.created_at, wdh.updated_at,
       wtg.name as wishlist_name, wtg.training_schedule
FROM wishlist_dataset_histories wdh
JOIN wishlist_to_groups wtg ON wdh.wishlist_to_group_id = wtg.id
WHERE wdh.status IN ('Pending', 'Analyzing')
ORDER BY wdh.updated_at ASC;

Check Stuck Datasets:

-- Find datasets stuck in processing (older than 2 hours)
SELECT wdh.id, wdh.dataset_id, wdh.status, wdh.updated_at,
       TIMESTAMPDIFF(MINUTE, wdh.updated_at, NOW()) as minutes_stuck,
       wtg.name as wishlist_name
FROM wishlist_dataset_histories wdh
JOIN wishlist_to_groups wtg ON wdh.wishlist_to_group_id = wtg.id
WHERE wdh.status IN ('Pending', 'Analyzing')
AND wdh.updated_at < DATE_SUB(NOW(), INTERVAL 2 HOUR)
ORDER BY wdh.updated_at ASC;

Check Recent Status Changes:

-- Review recent status change events
SELECT wdcl.*, wtg.name as wishlist_name, wdh.dataset_id
FROM wishlist_dataset_creation_logs wdcl
JOIN wishlist_to_groups wtg ON wdcl.wishlist_to_group_id = wtg.id
JOIN wishlist_dataset_histories wdh ON wdcl.wishlist_dataset_history_id = wdh.id
WHERE wdcl.event_type = 'Status Update'
AND wdcl.created_at > DATE_SUB(NOW(), INTERVAL 1 DAY)
ORDER BY wdcl.created_at DESC LIMIT 20;

Check Logs

Application Logs:

# Follow dataset:get-status command logs as they are written
tail -f storage/logs/laravel.log | grep --line-buffered "dataset:get-status"

# Check API communication logs
grep "AnalyzerApiService.*dataset.*detail" storage/logs/laravel.log | tail -20

# Check status change logs
grep "Dataset status changed" storage/logs/laravel.log | tail -10

# Check error patterns
grep -E "(ERROR|CRITICAL)" storage/logs/laravel.log | grep "dataset:get-status" | tail -10

Database Logs:

-- Check failed status update attempts
SELECT wdcl.*, wtg.name as wishlist_name
FROM wishlist_dataset_creation_logs wdcl
JOIN wishlist_to_groups wtg ON wdcl.wishlist_to_group_id = wtg.id
WHERE wdcl.event_type = 'Failure'
AND wdcl.message LIKE '%status%'
AND wdcl.created_at > DATE_SUB(NOW(), INTERVAL 1 DAY)
ORDER BY wdcl.created_at DESC;

-- Check datasets with error messages
SELECT wdh.id, wdh.dataset_id, wdh.error_message, wdh.updated_at,
       wtg.name as wishlist_name
FROM wishlist_dataset_histories wdh
JOIN wishlist_to_groups wtg ON wdh.wishlist_to_group_id = wtg.id
WHERE wdh.error_message IS NOT NULL
AND wdh.updated_at > DATE_SUB(NOW(), INTERVAL 1 DAY);

API Response Validation:


# Test the dataset detail endpoint
# (the host, DATASET_ID, and YOUR_API_TOKEN are placeholders)
curl "https://api.analyzer.example.com/datasets/DATASET_ID" \
  -H "Authorization: Bearer YOUR_API_TOKEN"

# Inspect response headers only, e.g. for rate-limit information
curl -I "https://api.analyzer.example.com/datasets" \
  -H "Authorization: Bearer YOUR_API_TOKEN"

Performance Monitoring:

  • Monitor command execution times for performance degradation
  • Check API response times and timeout configurations
  • Verify database transaction performance for large batches
  • Review notification delivery times and queue processing
  • Monitor Pusher service performance for real-time updates