Local DB Missed Rankings Sync

Command Signatures

php artisan localdb:sync-product_category_rankings --missed [--items-per-page=]
php artisan localdb:sync-product_search_query_rankings --missed [--items-per-page=]

Purpose

These commands process and synchronize temporary ranking data (category rankings and search query rankings) that have not yet been processed. They identify records with status = NotSynced in temporary tables and transfer them to the main database tables with proper relationships.

Sequence Diagrams

Category Rankings Sync Flow

sequenceDiagram
    participant System
    participant CategorySync as localdb:sync-product_category_rankings --missed
    participant TempCategoryTable as t_category_rankings
    participant CategoryTable as category_rankings
    participant ProductCategoryTable as product_category_rankings
    participant Logger
    participant Slack
    
    Note over System,Slack: Category Rankings Sync Flow (Every 5 Minutes)
    
    rect rgb(200, 255, 200)
    Note right of System: Happy Case - Normal Processing
    
    System->>CategorySync: Execute Command
    CategorySync->>Logger: Log job start
    
    CategorySync->>TempCategoryTable: Query WHERE status =0 NotSynced
    TempCategoryTable-->>CategorySync: Return unprocessed records
    
    rect rgb(200, 230, 255)
    alt Records Found
        Note right of CategorySync: Data Processing
        CategorySync->>CategorySync: Extract unique categories
        CategorySync->>CategoryTable: insertOrIgnore unique categories
        CategoryTable-->>CategorySync: Return inserted category IDs
        
        CategorySync->>CategorySync: Build product-category ranking records
        CategorySync->>ProductCategoryTable: Insert product rankings with foreign keys
        ProductCategoryTable-->>CategorySync: Confirm insertion
        
        CategorySync->>TempCategoryTable: Update status to 1 (Synced)
        TempCategoryTable-->>CategorySync: Confirm status update
        
        rect rgb(230, 200, 255)
        Note right of CategorySync: Success Monitoring
        CategorySync->>Logger: Log success with statistics
        CategorySync->>Slack: Send success notification
        end
    else No Records
        Note right of CategorySync: No Data Scenario
        CategorySync->>Logger: Log no records to process
    end
    end
    end
    
    rect rgb(255, 200, 200)
    Note right of System: Error Handling
    rect rgb(255, 230, 230)
    alt Error Occurs
        CategorySync->>Logger: Log error details
        CategorySync->>Slack: Send error notification
        CategorySync->>CategorySync: Rollback transaction
    end
    end
    end

Search Query Rankings Sync Flow

sequenceDiagram
    participant System
    participant SearchSync as localdb:sync-product_search_query_rankings --missed
    participant TempSearchTable as t_sq_rankings
    participant SearchQueryTable as search_query_rankings
    participant ProductSearchTable as product_search_query_rankings
    participant Logger
    participant Slack
    
    Note over System,Slack: Search Query Rankings Sync Flow (Every 5 Minutes)
    
    rect rgb(200, 255, 200)
    Note right of System: Happy Case - Normal Processing
    
    System->>SearchSync: Execute Command
    SearchSync->>Logger: Log job start
    
    SearchSync->>TempSearchTable: Query WHERE status =0 NotSynced
    TempSearchTable-->>SearchSync: Return unprocessed records
    
    rect rgb(200, 230, 255)
    alt Records Found
        Note right of SearchSync: Data Processing
        SearchSync->>SearchSync: Extract unique search queries
        SearchSync->>SearchQueryTable: insertOrIgnore unique search queries
        SearchQueryTable-->>SearchSync: Return inserted search query IDs
        
        SearchSync->>SearchSync: Build product-search query ranking records
        SearchSync->>ProductSearchTable: Insert product rankings with foreign keys
        ProductSearchTable-->>SearchSync: Confirm insertion
        
        SearchSync->>TempSearchTable: Update status to 1 (Synced)
        TempSearchTable-->>SearchSync: Confirm status update
        
        rect rgb(230, 200, 255)
        Note right of SearchSync: Success Monitoring
        SearchSync->>Logger: Log success with statistics
        SearchSync->>Slack: Send success notification
        end
    else No Records
        Note right of SearchSync: No Data Scenario
        SearchSync->>Logger: Log no records to process
    end
    end
    end
    
    rect rgb(255, 200, 200)
    Note right of System: Error Handling
    rect rgb(255, 230, 230)
    alt Error Occurs
        SearchSync->>Logger: Log error details
        SearchSync->>Slack: Send error notification
        SearchSync->>SearchSync: Rollback transaction
    end
    end
    end

Implementation Details

Command Structure

Both commands inherit from BaseLocalDBSyncCommand which provides:

  1. Query building with conditions based on status
  2. Chunking of data for batch processing
  3. Error handling and logging
  4. Database transaction management

Parameters

  • --missed: Process only records with status=0 NotSynced (this is the primary mode)
  • --items-per-page=N: Control batch size for processing (default: 500)
  • --all: Process all records regardless of status (rarely used)

Processing Flow

For category rankings:

  1. Query temporary records with 0 (NotSynced) status
  2. Extract unique categories and insert into category_rankings
  3. Look up inserted category IDs
  4. Create product-category ranking records with proper foreign keys
  5. Insert into product_category_rankings
  6. Update status of processed temporary records

For search query rankings:

  1. Query temporary records with 0 (NotSynced) status
  2. Extract unique search queries and insert into search_query_rankings
  3. Look up inserted search query IDs
  4. Create product-search query ranking records with proper foreign keys
  5. Insert into product_search_query_rankings
  6. Update status of processed temporary records

Database Schema

erDiagram
    t_category_rankings {
        bigint id PK
        string product_id "Product ID (nullable, indexed)"
        integer mall_id "Mall identifier (nullable, indexed)"
        string title "Product title (nullable)"
        float price "Product price (nullable)"
        string image "Product image URL (nullable, max 1000 chars)"
        string shop_url "Shop page URL (nullable, max 1000 chars)"
        string shop_name "Shop name (nullable)"
        string product_url "Product page URL (nullable, max 1000 chars)"
        string category_id "Category ID (nullable, indexed)"
        string category_name "Category name (nullable)"
        integer ranking "Product rank in category (nullable)"
        timestamp crawl_created_at "Timestamp from crawler (nullable, indexed)"
        tinyint status "Sync status: 0=NotSynced, 1=Synced (default 0, indexed)"
    }
    
    t_sq_rankings {
        bigint id PK
        string image "Product image URL (nullable, max 255 chars)"
        integer point "Product points (nullable)"
        float price "Product price (nullable)"
        string title "Product title (nullable, max 255 chars)"
        string keyword "Search keyword (nullable, indexed, max 100 chars)"
        integer mall_id "Mall identifier (nullable, indexed)"
        integer ranking "Product ranking for keyword (nullable)"
        tinyint sponsor "Sponsored product flag (nullable)"
        string shop_url "Shop URL (nullable, max 255 chars)"
        string shop_name "Shop name (nullable, max 100 chars)"
        string product_id "Product identifier (nullable, indexed)"
        string product_url "Product URL (nullable, max 255 chars)"
        timestamp crawl_created_at "Timestamp from crawler (nullable, indexed)"
        tinyint status "Sync status: 0=NotSynced, 1=Synced (default 0, indexed)"
    }
    
    category_rankings {
        bigint id PK
        string mall_category_id "Category ID in mall system (indexed)"
        string mall_category_name "Category name in mall (nullable)"
        integer mall_id "Mall identifier"
        string unique_key "System-generated unique key (unique, indexed, utf8mb4_bin)"
        timestamp crawl_created_at "Timestamp from crawler (nullable, indexed)"
    }
    
    search_query_rankings {
        bigint id PK
        string keyword "Search query text (unique, indexed)"
        timestamp crawl_created_at "Timestamp from crawler (nullable, indexed)"
    }
    
    product_category_rankings {
        bigint id PK
        bigint category_ranking_id FK
        integer ranking "Product rank in category (default 0, indexed)"
        string mall_product_id "Product ID in mall system (indexed)"
        string product_name "Product name"
        double price "Product price (default 0)"
        string shop_name "Shop name"
        text image "Product image URL"
        tinytext product_url "Product page URL"
        tinytext shop_url "Shop page URL"
        tinyint mall_id "Mall identifier (indexed)"
        string unique_key "System-generated unique key (unique, indexed, utf8mb4_bin)"
        timestamp crawl_created_at "Timestamp from crawler (nullable, indexed)"
    }
    
    product_search_query_rankings {
        bigint id PK
        bigint search_query_ranking_id FK
        string mall_product_id "asin or rakuten_id (indexed)"
        text product_name "Product name"
        text image "Product image URL"
        tinytext product_url "Product page URL"
        tinytext shop_url "Shop page URL (nullable)"
        string shop_name "Shop name (nullable)"
        tinyint sponsor "0: organic, 1: SP, 2: AB (default 0, indexed)"
        float point "Points (default 0)"
        integer mall_id "Mall identifier (default 0, indexed)"
        integer ranking "Product rank for search query (default 0, indexed)"
        string unique_key "System-generated unique key (unique, indexed, utf8mb4_bin)"
        timestamp crawl_created_at "Timestamp from crawler (nullable, indexed)"
    }
    
    t_category_rankings ||--o{ category_rankings : "synced to"
    t_category_rankings ||--o{ product_category_rankings : "synced to"
    t_sq_rankings ||--o{ search_query_rankings : "synced to"
    t_sq_rankings ||--o{ product_search_query_rankings : "synced to"
    category_rankings ||--o{ product_category_rankings : "has rankings"
    search_query_rankings ||--o{ product_search_query_rankings : "has rankings"

Error Handling

Transaction Management

  • Each job operates within a database transaction
  • Rollback on error to maintain data integrity
  • Retry mechanism for handling transient database errors

Logging

  • Detailed logs of job start/completion
  • Error logs with file and line information
  • Statistics including job counts and record counts

Slack Notifications

  • Error notifications sent to LocalDB Slack channel
  • Success reporting with statistics

Troubleshooting

Common Issues

  1. Duplicate Records: Handled by unique keys and insertOrIgnore
  2. Missing Relationships: Job skips records where relationships can't be established
  3. Transaction Timeouts: May occur with large batches - adjust batch size if needed
  4. Queue Bottlenecks: Monitor queue performance for delayed processing

Verification Steps

  1. Check temporary tables for remaining records with 0 (NotSynced) status
  2. Verify destination tables for expected data
  3. Monitor logs for successful job completion
  4. Check error logs for specific failure reasons