Local DB Sync Overview

Description

The Local DB Sync component is responsible for internal synchronization operations within the gb_analyzer database. This component processes temporary ranking data (prefixed with 't_') that is directly populated by the Crawler and transforms it into clean format in the main ranking tables. This ensures that product category rankings and product search query rankings are properly maintained and updated for analysis and reporting purposes.

Overview System Diagram

---
config:
  theme: base
  layout: dagre
  flowchart:
    curve: linear
    htmlLabels: true
  themeVariables:
    edgeLabelBackground: "transparent"
---
flowchart TD
    Crawler((Crawler System))
    SyncCommands[LocalDB Sync Commands]
    
    subgraph TempTables["gb_analyzer Temporary Tables"]
        t_category_rankings[(t_category_rankings)]
        t_sq_rankings[(t_sq_rankings)]
    end
    
    subgraph MainTables["gb_analyzer Main Tables"]
        CategoryRankings[(category_rankings)]
        SearchQueryRankings[(search_query_rankings)]
        ProductCategoryRankings[(product_category_rankings)]
        ProductSearchQueryRankings[(product_search_query_rankings)]
    end
    
    Crawler --- CrawlerStep[
        <div style='text-align: center'>
            <span style='display: inline-block; background-color: #6699cc !important; color:white; width: 28px; height: 28px; line-height: 28px; border-radius: 50%; font-weight: bold'>1</span>
            <p style='margin-top: 8px'>Store Ranking Data</p>
        </div>
    ]
    CrawlerStep --> TempTables
    
    TempTables --- TempTablesStep[
        <div style='text-align: center'>
            <span style='display: inline-block; background-color: #6699cc !important; color:white; width: 28px; height: 28px; line-height: 28px; border-radius: 50%; font-weight: bold'>2</span>
            <p style='margin-top: 8px'>Identify status=0 NotSynced</p>
        </div>
    ]
    TempTablesStep --> SyncCommands
    
    SyncCommands --- CategoryStep[
        <div style='text-align: center'>
            <span style='display: inline-block; background-color: #99cc66 !important; color:white; width: 28px; height: 28px; line-height: 28px; border-radius: 50%; font-weight: bold'>3A</span>
            <p style='margin-top: 8px'>Store Category Data</p>
        </div>
    ]
    CategoryStep --> CategoryRankings
    
    SyncCommands --- ProductCategoryStep[
        <div style='text-align: center'>
            <span style='display: inline-block; background-color: #99cc66 !important; color:white; width: 28px; height: 28px; line-height: 28px; border-radius: 50%; font-weight: bold'>3B</span>
            <p style='margin-top: 8px'>Store Product Category Rankings</p>
        </div>
    ]
    ProductCategoryStep --> ProductCategoryRankings
    
    SyncCommands --- SearchQueryStep[
        <div style='text-align: center'>
            <span style='display: inline-block; background-color: #cc66cc !important; color:white; width: 28px; height: 28px; line-height: 28px; border-radius: 50%; font-weight: bold'>4A</span>
            <p style='margin-top: 8px'>Store Search Query Data</p>
        </div>
    ]
    SearchQueryStep --> SearchQueryRankings
    
    SyncCommands --- ProductSearchStep[
        <div style='text-align: center'>
            <span style='display: inline-block; background-color: #cc66cc !important; color:white; width: 28px; height: 28px; line-height: 28px; border-radius: 50%; font-weight: bold'>4B</span>
            <p style='margin-top: 8px'>Store Product Search Rankings</p>
        </div>
    ]
    ProductSearchStep --> ProductSearchQueryRankings
    
    SyncCommands --- StatusUpdateStep[
        <div style='text-align: center'>
            <span style='display: inline-block; background-color: #6699cc !important; color:white; width: 28px; height: 28px; line-height: 28px; border-radius: 50%; font-weight: bold'>5</span>
            <p style='margin-top: 8px'>Mark as status=1 Synced</p>
        </div>
    ]
    StatusUpdateStep --> TempTables
        
    style Crawler fill:#fcd9f2,stroke:#ff9900,stroke-width:2px
    style SyncCommands fill:#d9f2d9
    style TempTables fill:#fcd9f2,stroke:#ff9900,stroke-width:1px
    style MainTables fill:#d9d9f2,stroke:#339933,stroke-width:1px
    style CategoryRankings fill:#e6f0ff,stroke:#6666cc,stroke-width:1px
    style SearchQueryRankings fill:#e6f0ff,stroke:#6666cc,stroke-width:1px
    style ProductCategoryRankings fill:#e6f0ff,stroke:#6666cc,stroke-width:1px
    style ProductSearchQueryRankings fill:#e6f0ff,stroke:#6666cc,stroke-width:1px
    style t_category_rankings fill:#fcd9f2,stroke:#ff6666,stroke-width:1px
    style t_sq_rankings fill:#fcd9f2,stroke:#ff6666,stroke-width:1px
    style CrawlerStep fill:transparent,stroke:transparent,stroke-width:1px
    style TempTablesStep fill:transparent,stroke:transparent,stroke-width:1px
    style CategoryStep fill:transparent,stroke:transparent,stroke-width:1px
    style ProductCategoryStep fill:transparent,stroke:transparent,stroke-width:1px
    style SearchQueryStep fill:transparent,stroke:transparent,stroke-width:1px
    style ProductSearchStep fill:transparent,stroke:transparent,stroke-width:1px
    style StatusUpdateStep fill:transparent,stroke:transparent,stroke-width:1px

Detail Dataflow Dependency

The Local DB Sync component follows a data flow where:

  1. New ranking data is first stored in temporary tables (t_category_rankings, t_sq_rankings) by the Crawler system
  2. The localdb:sync-product-category-rankings and localdb:sync-product-search-query-rankings commands identify records with status =0 NotSynced in temporary tables
  3. For category rankings:
    • Category data is extracted and stored in category_rankings table
    • Product ranking data is processed and stored in product_category_rankings table
  4. For search query rankings:
    • Search query data is extracted and stored in search_query_rankings table
    • Product ranking data is processed and stored in product_search_query_rankings table
  5. Processed temporary records are marked with status =1 Synced

Frequency Overview

Timeline

timeline
    title Local DB Sync Schedule
    section Regular Operations
        Every 5 minutes<br>(Ex. 08.00, 08.05) : localdb sync-product-category-rankings
                                              : localdb sync-product-search-query-rankings

End-user Impact

When these commands execute successfully, end-users benefit from:

  • Complete product category ranking information for comprehensive category analysis
  • Accurate product search query ranking data for search performance monitoring
  • Consistent data integrity within the local database
  • Reliable recovery of any missed ranking data through frequent checks
  • Enhanced analytical capabilities for category and search performance
  • Real-time visibility into product positioning across categories and search results
  • Trend identification across different marketplaces and time periods

Database Schema

erDiagram
    t_category_rankings {
        bigint id PK
        string product_id "Product ID (nullable, indexed)"
        integer mall_id "Mall identifier (nullable, indexed)"
        string title "Product title (nullable)"
        float price "Product price (nullable)"
        string image "Product image URL (nullable, max 1000 chars)"
        string shop_url "Shop page URL (nullable, max 1000 chars)"
        string shop_name "Shop name (nullable)"
        string product_url "Product page URL (nullable, max 1000 chars)"
        string category_id "Category ID (nullable, indexed)"
        string category_name "Category name (nullable)"
        integer ranking "Product rank in category (nullable)"
        timestamp crawl_created_at "Timestamp from crawler (nullable, indexed)"
        tinyint status "Sync status: 0=NotSynced, 1=Synced (default 0, indexed)"
    }
    
    t_sq_rankings {
        bigint id PK
        string image "Product image URL (nullable, max 255 chars)"
        integer point "Product points (nullable)"
        float price "Product price (nullable)"
        string title "Product title (nullable, max 255 chars)"
        string keyword "Search keyword (nullable, indexed, max 100 chars)"
        integer mall_id "Mall identifier (nullable, indexed)"
        integer ranking "Product ranking for keyword (nullable)"
        tinyint sponsor "Sponsored product flag (nullable)"
        string shop_url "Shop URL (nullable, max 255 chars)"
        string shop_name "Shop name (nullable, max 100 chars)"
        string product_id "Product identifier (nullable, indexed)"
        string product_url "Product URL (nullable, max 255 chars)"
        timestamp crawl_created_at "Timestamp from crawler (nullable, indexed)"
        tinyint status "Sync status: 0=NotSynced, 1=Synced (default 0, indexed)"
    }
    
    category_rankings {
        bigint id PK
        string mall_category_id "Category ID in mall system (indexed)"
        string mall_category_name "Category name in mall (nullable)"
        integer mall_id "Mall identifier"
        string unique_key "System-generated unique key (unique, indexed, utf8mb4_bin)"
        timestamp crawl_created_at "Timestamp from crawler (nullable, indexed)"
    }
    
    search_query_rankings {
        bigint id PK
        string keyword "Search query text (unique, indexed)"
        timestamp crawl_created_at "Timestamp from crawler (nullable, indexed)"
    }
    
    product_category_rankings {
        bigint id PK
        bigint category_ranking_id FK
        integer ranking "Product rank in category (default 0, indexed)"
        string mall_product_id "Product ID in mall system (indexed)"
        string product_name "Product name"
        double price "Product price (default 0)"
        string shop_name "Shop name"
        text image "Product image URL"
        tinytext product_url "Product page URL"
        tinytext shop_url "Shop page URL"
        tinyint mall_id "Mall identifier (indexed)"
        string unique_key "System-generated unique key (unique, indexed, utf8mb4_bin)"
        timestamp crawl_created_at "Timestamp from crawler (nullable, indexed)"
    }
    
    product_search_query_rankings {
        bigint id PK
        bigint search_query_ranking_id FK
        string mall_product_id "asin or rakuten_id (indexed)"
        text product_name "Product name"
        text image "Product image URL"
        tinytext product_url "Product page URL"
        tinytext shop_url "Shop page URL (nullable)"
        string shop_name "Shop name (nullable)"
        tinyint sponsor "0: organic, 1: SP, 2: AB (default 0, indexed)"
        float point "Points (default 0)"
        integer mall_id "Mall identifier (default 0, indexed)"
        integer ranking "Product rank for search query (default 0, indexed)"
        string unique_key "System-generated unique key (unique, indexed, utf8mb4_bin)"
        timestamp crawl_created_at "Timestamp from crawler (nullable, indexed)"
    }
    
    t_category_rankings ||--o{ category_rankings : "synced to"
    t_category_rankings ||--o{ product_category_rankings : "synced to"
    t_sq_rankings ||--o{ search_query_rankings : "synced to"
    t_sq_rankings ||--o{ product_search_query_rankings : "synced to"
    category_rankings ||--o{ product_category_rankings : "has rankings"
    search_query_rankings ||--o{ product_search_query_rankings : "has rankings"

Table Categories

Temporary Tables (Populated by Crawler)

  • t_category_rankings: Holds temporary category ranking data with sync status tracking
  • t_sq_rankings: Holds temporary search query ranking data with sync status tracking

Main Tables (Populated by LocalDB Sync)

  • category_rankings: Stores unique category information extracted from temporary data
  • search_query_rankings: Stores unique search queries extracted from temporary data
  • product_category_rankings: Stores detailed product rankings within categories
  • product_search_query_rankings: Stores detailed product rankings for search queries

Batch List

Name Description
Missed Rankings Sync Commands that run every 5 minutes to sync ranking data from temporary tables to main tables