Sync Success
Command Signature
php artisan analyzerdb:sync-crawl-success-from-analyzer --data-type=SummaryProduct [--limit=100]
php artisan analyzerdb:sync-crawl-success-from-analyzer --data-type=SummaryProductReview [--limit=100]
php artisan analyzerdb:sync-crawl-success-from-analyzer --data-type=SummaryCategory [--limit=100]
php artisan analyzerdb:sync-crawl-success-from-analyzer --data-type=SummarySearchQuery [--limit=100]
Purpose
These commands verify whether records from the summary wishlist tables in the Console database (gb_console) have been successfully processed in the Analyzer database (gb_analyzer). When matching records are found in the Analyzer database with appropriate timestamps, the corresponding records in the Console database are updated with a "Success" crawl status, maintaining accurate tracking of the data processing pipeline across both databases.
Sequence Diagram
Main Command Flow
sequenceDiagram
participant System
participant Command as analyzerdb:sync-crawl-success-from-analyzer
participant Repository as SummaryWishlist*Repository
participant ConsoleDB as gb_console
participant Logger
participant Slack
Note over System,Slack: Command Execution Flow (Every 5 Minutes)
rect rgb(200, 255, 200)
Note right of System: Happy Case - Command Processing
System->>Command: Execute with --data-type parameter
Command->>Logger: Log command start with parameters
Command->>Repository: chunkDataUpdateCrawlStatusSuccess()
Repository->>ConsoleDB: Query records WHERE crawl_status = Crawling
ConsoleDB-->>Repository: Return records needing verification
Repository-->>Command: Return chunks of records (max: limit option)
rect rgb(200, 230, 255)
loop For each chunk of records
Note right of Command: Batch Job Dispatch
Command->>Command: Dispatch specialized job based on data type
Command->>Logger: Log chunk processing start
end
end
Command->>Logger: Log command completion
Command->>Slack: Send processing summary notification
end
rect rgb(255, 200, 200)
Note right of System: Error Handling
rect rgb(255, 230, 230)
alt Command Error Occurs
Command->>Logger: Log command error details
Command->>Slack: Send error notification
end
end
end
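In code, the command flow above might look like the following minimal sketch. The class name, repository resolution, and method signatures are assumptions drawn from the diagram, not the actual implementation; only the SummaryProduct branch is shown.

```php
<?php

namespace App\Console\Commands;

use App\Jobs\SummaryProductJob;                                  // hypothetical class names
use App\Repositories\SummaryWishlistProductRepositoryInterface;  // and namespaces
use Illuminate\Console\Command;
use Illuminate\Support\Facades\Log;

class SyncCrawlSuccessFromAnalyzer extends Command
{
    protected $signature = 'analyzerdb:sync-crawl-success-from-analyzer
                            {--data-type= : SummaryProduct|SummaryProductReview|SummaryCategory|SummarySearchQuery}
                            {--limit=100 : Maximum records per chunk}';

    public function handle(SummaryWishlistProductRepositoryInterface $repository): int
    {
        $limit = (int) $this->option('limit');
        Log::info('sync-crawl-success-from-analyzer started', [
            'data_type' => $this->option('data-type'),
            'limit'     => $limit,
        ]);

        // Chunk the Crawling-status records and dispatch one verification job
        // per chunk; the other three data types follow the same shape.
        $repository->chunkDataUpdateCrawlStatusSuccess($limit, function ($records) {
            Log::info('Dispatching verification chunk', ['count' => $records->count()]);
            SummaryProductJob::dispatch($records);
        });

        Log::info('sync-crawl-success-from-analyzer finished');
        return self::SUCCESS;
    }
}
```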
SummaryProduct Job Flow
sequenceDiagram
participant Job as SummaryProductJob
participant ProductRepo as ProductRepositoryInterface
participant AnalyzerDB as gb_analyzer.products
participant ConsoleRepo as SummaryWishlistProductRepository
participant ConsoleDB as gb_console.summary_wishlist_products
participant Logger
participant Slack
Note over Job,Slack: Product Verification Job Flow
rect rgb(200, 255, 200)
Note right of Job: Happy Case - Product Verification
Job->>Logger: Log job start with record count
Job->>Job: Extract mall_id, input_type, input from records
Job->>ProductRepo: Query products with lookup keys
ProductRepo->>AnalyzerDB: SELECT WHERE mall_id, input_type, input match
Note right of AnalyzerDB: Time constraint: crawl_created_at >= updated_at
AnalyzerDB-->>ProductRepo: Return matching products
ProductRepo-->>Job: Return verified product records
rect rgb(255, 255, 200)
alt Products found in Analyzer DB
Note right of Job: Success Update
Job->>ConsoleRepo: updateByConditions(crawl_status = Success)
ConsoleRepo->>ConsoleDB: UPDATE crawl_status = 2 WHERE id IN (...)
ConsoleDB-->>ConsoleRepo: Confirm batch update
ConsoleRepo-->>Job: Return update count
rect rgb(230, 200, 255)
Note right of Job: Success Monitoring
Job->>Logger: Log successful updates with count
Job->>Slack: Send success notification with statistics
end
else No matching products
Note right of Job: No Action Required
Job->>Logger: Log no updates needed
end
end
end
rect rgb(255, 200, 200)
Note right of Job: Error Handling
rect rgb(255, 230, 230)
alt Database Error Occurs
Job->>Job: Detect deadlock/connection error
Job->>Job: Implement exponential backoff retry
Job->>Logger: Log retry attempt with sleep time
rect rgb(233, 196, 196)
alt Max retries exceeded (5 attempts)
Job->>Logger: Log final error with stack trace
Job->>Slack: Send error notification
end
end
end
end
end
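A sketch of the corresponding job logic, under the same caveat: the repository method names come from the diagram, but their signatures (and the existsCrawledSince lookup helper) are hypothetical.

```php
<?php

namespace App\Jobs;

use App\Enums\CrawlStatus;                                       // hypothetical enum (see below)
use App\Repositories\ProductRepositoryInterface;                 // hypothetical namespaces
use App\Repositories\SummaryWishlistProductRepositoryInterface;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Support\Collection;
use Illuminate\Support\Facades\Log;

class SummaryProductJob implements ShouldQueue
{
    use Dispatchable, Queueable;

    public function __construct(private Collection $records) {}

    public function handle(
        ProductRepositoryInterface $products,
        SummaryWishlistProductRepositoryInterface $console,
    ): void {
        Log::info('SummaryProductJob started', ['count' => $this->records->count()]);

        // Keep only records with a matching analyzer product crawled after the
        // console record was last updated (one query per record, for clarity).
        $verified = $this->records->filter(
            fn ($r) => $products->existsCrawledSince(
                $r->mall_id, $r->input_type, $r->input, $r->updated_at,
            )
        );

        if ($verified->isEmpty()) {
            Log::info('SummaryProductJob: no updates needed');
            return;
        }

        // Flip the verified console records to Success (crawl_status = 2).
        $updated = $console->updateByConditions(
            $verified->pluck('id')->all(),
            ['crawl_status' => CrawlStatus::Success->value],
        );
        Log::info('SummaryProductJob updated records', ['updated' => $updated]);
    }
}
```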
SummaryProductReview Job Flow
sequenceDiagram
participant Job as SummaryProductReviewJob
participant ReviewRepo as ReviewRepositoryInterface
participant AnalyzerDB as gb_analyzer.reviews
participant ConsoleRepo as SummaryWishlistProductReviewRepository
participant ConsoleDB as gb_console.summary_wishlist_product_reviews
participant Logger
participant Slack
Note over Job,Slack: Product Review Verification Job Flow
rect rgb(200, 255, 200)
Note right of Job: Happy Case - Review Verification
Job->>Logger: Log job start with record count
Job->>Job: Extract product identifiers from related products
Job->>ReviewRepo: Query reviews with product identifiers
ReviewRepo->>AnalyzerDB: SELECT WHERE mall_id, product identifiers match
Note right of AnalyzerDB: Time constraint: crawl_created_at >= updated_at
AnalyzerDB-->>ReviewRepo: Return matching reviews
ReviewRepo-->>Job: Return verified review records
rect rgb(255, 255, 200)
alt Reviews found in Analyzer DB
Note right of Job: Success Update
Job->>ConsoleRepo: updateBySummaryWishlistProductIds()
ConsoleRepo->>ConsoleDB: UPDATE crawl_status = 2 WHERE summary_wishlist_product_id IN (...)
ConsoleDB-->>ConsoleRepo: Confirm batch update
ConsoleRepo-->>Job: Return update count
rect rgb(230, 200, 255)
Note right of Job: Success Monitoring
Job->>Logger: Log successful updates with count
Job->>Slack: Send success notification with statistics
end
else No matching reviews
Note right of Job: No Action Required
Job->>Logger: Log no updates needed
end
end
end
rect rgb(255, 200, 200)
Note right of Job: Error Handling
rect rgb(255, 230, 230)
alt Database Error Occurs
Job->>Job: Detect deadlock/connection error
Job->>Job: Implement exponential backoff retry
Job->>Logger: Log retry attempt with sleep time
rect rgb(233, 196, 196)
alt Max retries exceeded (5 attempts)
Job->>Logger: Log final error with stack trace
Job->>Slack: Send error notification
end
end
end
end
end
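The review flow differs from the product flow in one respect: the lookup keys come from the parent summary_wishlist_products row, not from the review record itself. An illustrative query, assuming both schemas live on the same MySQL server and the product's input maps to jan_code (the actual column depends on input_type):

```sql
SELECT r.id
FROM gb_console.summary_wishlist_product_reviews r
JOIN gb_console.summary_wishlist_products w
  ON w.id = r.summary_wishlist_product_id
JOIN gb_analyzer.reviews ar
  ON ar.mall_id  = w.mall_id
 AND ar.jan_code = w.input
WHERE r.crawl_status = 1                       -- Crawling
  AND ar.crawl_created_at >= r.updated_at;     -- time constraint
```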
SummaryCategory Job Flow
sequenceDiagram
participant Job as SummaryCategoryJob
participant CategoryRepo as TempCategoryRankingRepositoryInterface
participant AnalyzerDB as gb_analyzer.t_category_rankings
participant ConsoleRepo as SummaryWishlistCategoryRepository
participant ConsoleDB as gb_console.summary_wishlist_categories
participant Logger
participant Slack
Note over Job,Slack: Category Verification Job Flow
rect rgb(200, 255, 200)
Note right of Job: Happy Case - Category Verification
Job->>Logger: Log job start with record count
Job->>Job: Extract mall_id, category_id from records
Job->>CategoryRepo: Query category rankings with lookup keys
CategoryRepo->>AnalyzerDB: SELECT WHERE mall_id, category_id match
Note right of AnalyzerDB: Time constraint: crawl_created_at >= updated_at
AnalyzerDB-->>CategoryRepo: Return matching category rankings
CategoryRepo-->>Job: Return verified category records
rect rgb(255, 255, 200)
alt Categories found in Analyzer DB
Note right of Job: Success Update
Job->>ConsoleRepo: updateByConditions(crawl_status = Success)
ConsoleRepo->>ConsoleDB: UPDATE crawl_status = 2 WHERE id IN (...)
ConsoleDB-->>ConsoleRepo: Confirm batch update
ConsoleRepo-->>Job: Return update count
rect rgb(230, 200, 255)
Note right of Job: Success Monitoring
Job->>Logger: Log successful updates with count
Job->>Slack: Send success notification with statistics
end
else No matching categories
Note right of Job: No Action Required
Job->>Logger: Log no updates needed
end
end
end
rect rgb(255, 200, 200)
Note right of Job: Error Handling
rect rgb(255, 230, 230)
alt Database Error Occurs
Job->>Job: Detect deadlock/connection error
Job->>Job: Implement exponential backoff retry
Job->>Logger: Log retry attempt with sleep time
rect rgb(233, 196, 196)
alt Max retries exceeded (5 attempts)
Job->>Logger: Log final error with stack trace
Job->>Slack: Send error notification
end
end
end
end
end
SummarySearchQuery Job Flow
sequenceDiagram
participant Job as SummarySearchQueryJob
participant SearchRepo as TempSearchQueryRankingRepositoryInterface
participant AnalyzerDB as gb_analyzer.t_sq_rankings
participant ConsoleRepo as SummaryWishlistSearchQueryRepository
participant ConsoleDB as gb_console.summary_wishlist_search_queries
participant Logger
participant Slack
Note over Job,Slack: Search Query Verification Job Flow
rect rgb(200, 255, 200)
Note right of Job: Happy Case - Search Query Verification
Job->>Logger: Log job start with record count
Job->>Job: Extract mall_id, keyword from records
Job->>SearchRepo: Query search query rankings with lookup keys
SearchRepo->>AnalyzerDB: SELECT WHERE mall_id, keyword match
Note right of AnalyzerDB: Time constraint: crawl_created_at >= updated_at
AnalyzerDB-->>SearchRepo: Return matching search query rankings
SearchRepo-->>Job: Return verified search query records
rect rgb(255, 255, 200)
alt Search queries found in Analyzer DB
Note right of Job: Success Update
Job->>ConsoleRepo: updateByConditions(crawl_status = Success)
ConsoleRepo->>ConsoleDB: UPDATE crawl_status = 2 WHERE id IN (...)
ConsoleDB-->>ConsoleRepo: Confirm batch update
ConsoleRepo-->>Job: Return update count
rect rgb(230, 200, 255)
Note right of Job: Success Monitoring
Job->>Logger: Log successful updates with count
Job->>Slack: Send success notification with statistics
end
else No matching search queries
Note right of Job: No Action Required
Job->>Logger: Log no updates needed
end
end
end
rect rgb(255, 200, 200)
Note right of Job: Error Handling
rect rgb(255, 230, 230)
alt Database Error Occurs
Job->>Job: Detect deadlock/connection error
Job->>Job: Implement exponential backoff retry
Job->>Logger: Log retry attempt with sleep time
rect rgb(233, 196, 196)
alt Max retries exceeded (5 attempts)
Job->>Logger: Log final error with stack trace
Job->>Slack: Send error notification
end
end
end
end
end
Database Schema
Console Database Tables (gb_console)
erDiagram
summary_wishlist_products {
bigint id PK
string input "The input of the product"
string input_type "The type of the input: jan, asin, rakuten_id"
bigint mall_id FK "Foreign key to malls table"
integer crawl_status "The status of the crawling (updated by SummaryProductJob)"
timestamp updated_at "Used for time constraint verification"
}
summary_wishlist_product_reviews {
bigint id PK
bigint summary_wishlist_product_id FK "Foreign key to summary_wishlist_products"
integer crawl_status "The status of the crawling (updated by SummaryProductReviewJob)"
timestamp updated_at "Used for time constraint verification"
}
summary_wishlist_categories {
bigint id PK
string category_id "The category identifier"
bigint mall_id FK "Foreign key to malls table"
integer crawl_status "The status of the crawling (updated by SummaryCategoryJob)"
timestamp updated_at "Used for time constraint verification"
}
summary_wishlist_search_queries {
bigint id PK
bigint mall_id FK "Foreign key to malls table"
string keyword "The search keyword"
integer crawl_status "The status of the crawling (updated by SummarySearchQueryJob)"
timestamp updated_at "Used for time constraint verification"
}
summary_wishlist_products ||--o| summary_wishlist_product_reviews : "has one"
Analyzer Database Tables (gb_analyzer)
erDiagram
products {
bigint id PK
bigint mall_id FK "Mall identifier (used for lookup)"
string mall_product_id "Product ID from the mall"
string jan_code "JAN code (nullable, used for lookup)"
string input_type "Type of input: asin, jan, rakuten_id (used for lookup)"
timestamp crawl_created_at "When the product was crawled (time constraint field)"
}
reviews {
bigint id PK
bigint mall_id FK "Mall identifier (used for lookup)"
string mall_product_id "Product ID from mall (used for lookup)"
string jan_code "JAN code (nullable, max 50 chars, used for lookup)"
timestamp crawl_created_at "When the review was crawled (time constraint field)"
}
t_category_rankings {
bigint id PK
bigint mall_id FK "Mall identifier (used for lookup)"
string category_id "Category identifier (used for lookup)"
timestamp crawl_created_at "When the ranking was crawled (time constraint field)"
}
t_sq_rankings {
bigint id PK
string keyword "Search keyword (used for lookup, max 100 chars)"
bigint mall_id FK "Mall identifier (used for lookup)"
timestamp crawl_created_at "When the ranking was crawled (time constraint field)"
}
Verification Lookup Keys
SummaryProduct Verification
- Console Table: summary_wishlist_products
- Analyzer Table: products
- Lookup Keys:
  - mall_id (integer): must match exactly
  - input_type (string): jan, asin, or rakuten_id
  - input (string): the actual product identifier
- Time Constraint: products.crawl_created_at >= summary_wishlist_products.updated_at
- Update Field: crawl_status = 2 (Success)
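Expressed as SQL, the product check could look like the sketch below; it is illustrative only, with the analyzer-side lookup column (jan_code here) depending on input_type:

```sql
-- Does a matching, sufficiently fresh analyzer product exist for this record?
SELECT COUNT(*) > 0 AS verified
FROM products
WHERE mall_id    = :mall_id
  AND input_type = :input_type                 -- 'jan', 'asin', or 'rakuten_id'
  AND jan_code   = :input                      -- column varies by input_type
  AND crawl_created_at >= :updated_at;
```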
SummaryProductReview Verification
- Console Table: summary_wishlist_product_reviews
- Analyzer Table: reviews
- Lookup Keys:
  - mall_id (integer): taken from the related summary_wishlist_product
  - Product identifiers: based on the input_type of the related product
- Time Constraint: reviews.crawl_created_at >= summary_wishlist_product_reviews.updated_at
- Update Field: crawl_status = 2 (Success)
SummaryCategory Verification
- Console Table: summary_wishlist_categories
- Analyzer Table: t_category_rankings
- Lookup Keys:
  - mall_id (integer): must match exactly
  - category_id (string): must match exactly
- Time Constraint: t_category_rankings.crawl_created_at >= summary_wishlist_categories.updated_at
- Update Field: crawl_status = 2 (Success)
SummarySearchQuery Verification
- Console Table: summary_wishlist_search_queries
- Analyzer Table: t_sq_rankings
- Lookup Keys:
  - mall_id (integer): must match exactly
  - keyword (string): must match exactly
- Time Constraint: t_sq_rankings.crawl_created_at >= summary_wishlist_search_queries.updated_at
- Update Field: crawl_status = 2 (Success)
CrawlStatus Enum Values
The crawl_status field in all Console tables uses the following enum values:
- 0 (NotCrawled): Initial state for new records
- 1 (Crawling): Record is currently being processed by the crawler; these are the records the sync commands select
- 2 (Success): Processing completed successfully (updated by these commands)
- 3 (Failed): Processing failed with errors
- 4 (Error): System error occurred during processing
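In PHP these values map naturally onto a backed enum; a sketch (the enum's actual name and namespace are assumptions):

```php
<?php

namespace App\Enums;

enum CrawlStatus: int
{
    case NotCrawled = 0; // initial state for new records
    case Crawling   = 1; // being processed; selected by the sync commands
    case Success    = 2; // verified in the Analyzer database
    case Failed     = 3; // processing failed with errors
    case Error      = 4; // system error during processing
}
```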
Repository Interfaces Used
- SummaryWishlistProductRepositoryInterface: Handles summary_wishlist_products operations
- SummaryWishlistProductReviewRepositoryInterface: Handles summary_wishlist_product_reviews operations
- SummaryWishlistCategoryRepositoryInterface: Handles summary_wishlist_categories operations
- SummaryWishlistSearchQueryRepositoryInterface: Handles summary_wishlist_search_queries operations
- ProductRepositoryInterface: Queries the products table in the Analyzer database
- ReviewRepositoryInterface: Queries the reviews table in the Analyzer database
- TempCategoryRankingRepositoryInterface: Queries the t_category_rankings table in the Analyzer database
- TempSearchQueryRankingRepositoryInterface: Queries the t_sq_rankings table in the Analyzer database
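The console-side interfaces share a common shape. A hypothetical sketch of one of them, with method names taken from the diagrams but parameter lists assumed:

```php
<?php

namespace App\Repositories;

interface SummaryWishlistProductRepositoryInterface
{
    /** Chunk records in Crawling status and pass each chunk to $callback. */
    public function chunkDataUpdateCrawlStatusSuccess(int $limit, callable $callback): void;

    /** Batch-update the given records (e.g. to crawl_status = 2); returns affected rows. */
    public function updateByConditions(array $ids, array $attributes): int;
}
```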
Detail
Parameters
- --data-type: Required parameter specifying the type of data to verify
  - SummaryProduct: Verifies product data in the summary_wishlist_products table
  - SummaryProductReview: Verifies product review data in the summary_wishlist_product_reviews table
  - SummaryCategory: Verifies category data in the summary_wishlist_categories table
  - SummarySearchQuery: Verifies search query data in the summary_wishlist_search_queries table
- --limit=N: Optional parameter controlling the batch size (default: 100 records per chunk)
Frequency
Every 5 minutes for each data type (SummaryProduct, SummaryProductReview, SummaryCategory, SummarySearchQuery)
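A five-minute cadence like this is typically registered in the console kernel; a sketch, assuming the standard Laravel scheduler (the actual entries may differ):

```php
// app/Console/Kernel.php
use Illuminate\Console\Scheduling\Schedule;

protected function schedule(Schedule $schedule): void
{
    foreach ([
        'SummaryProduct', 'SummaryProductReview',
        'SummaryCategory', 'SummarySearchQuery',
    ] as $type) {
        $schedule->command("analyzerdb:sync-crawl-success-from-analyzer --data-type={$type}")
            ->everyFiveMinutes()
            ->withoutOverlapping();
    }
}
```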
Dependencies
- Console Database (gb_console): Must contain records in the summary wishlist tables with crawl_status = Crawling
- Analyzer Database (gb_analyzer): Must contain processed data in the corresponding tables:
  - products for SummaryProduct verification
  - reviews for SummaryProductReview verification
  - t_category_rankings for SummaryCategory verification
  - t_sq_rankings for SummarySearchQuery verification
- Laravel Queue System: For asynchronous job processing
- Repository Interfaces: For database abstraction and batch operations
Output
Tables
- summary_wishlist_products: Updates the crawl_status field to 2 (Success)
  - Updated by: SummaryProductJob
  - Lookup method: Based on the mall_id, input_type, and input fields
  - Time constraint: crawl_created_at >= updated_at
- summary_wishlist_product_reviews: Updates the crawl_status field to 2 (Success)
  - Updated by: SummaryProductReviewJob
  - Lookup method: Based on the related product's identifiers
  - Time constraint: crawl_created_at >= updated_at
- summary_wishlist_categories: Updates the crawl_status field to 2 (Success)
  - Updated by: SummaryCategoryJob
  - Lookup method: Based on the mall_id and category_id fields
  - Time constraint: crawl_created_at >= updated_at
- summary_wishlist_search_queries: Updates the crawl_status field to 2 (Success)
  - Updated by: SummarySearchQueryJob
  - Lookup method: Based on the mall_id and keyword fields
  - Time constraint: crawl_created_at >= updated_at
Services
- Job Queue: Dispatches specialized verification jobs for each data type
- Database Transactions: Ensures atomic updates with rollback capability
- Slack Notifications: Sends success/error notifications with processing statistics
- Logging System: Records detailed execution logs with timestamps and record counts
Error Handling
Log
The system generates comprehensive logs for monitoring and debugging:
- Command Execution Logs: Start/end timestamps with data type and limit parameters
- Progress Tracking: Number of chunks processed and total records verified
- Success Updates: Count of records updated to Success status with specific IDs
- Error Details: Database errors with file/line information and retry attempts
- Time Constraint Violations: Records that don't meet timestamp requirements
Slack
Automated notifications are sent to configured Slack channels:
- Success Notifications: Include processing statistics (records processed, updated count, execution time)
- Error Notifications: Contain detailed error messages with context and stack traces
- Retry Notifications: Alert when database errors trigger retry mechanisms
- Final Failure Alerts: Sent when maximum retry attempts are exceeded
Troubleshooting
Check Data
- Verify Console Database Records:
  SELECT COUNT(*) FROM summary_wishlist_products WHERE crawl_status = 1; -- Crawling status
  SELECT COUNT(*) FROM summary_wishlist_products WHERE crawl_status = 2; -- Success status
- Check Analyzer Database Population:
  SELECT COUNT(*) FROM products WHERE crawl_created_at >= NOW() - INTERVAL 1 HOUR;
  SELECT COUNT(*) FROM reviews WHERE crawl_created_at >= NOW() - INTERVAL 1 HOUR;
- Validate Time Constraints:
  SELECT id, updated_at FROM summary_wishlist_products WHERE crawl_status = 1 ORDER BY updated_at DESC LIMIT 10;
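If records stay in Crawling even though analyzer data exists, comparing the two timestamps side by side can expose clock or ordering problems. A sketch, assuming both schemas are reachable from one connection and the product's input maps to jan_code:

```sql
-- Stuck product records with the freshest matching analyzer crawl (if any).
SELECT w.id, w.updated_at, MAX(p.crawl_created_at) AS latest_crawl
FROM gb_console.summary_wishlist_products w
LEFT JOIN gb_analyzer.products p
       ON p.mall_id = w.mall_id AND p.jan_code = w.input
WHERE w.crawl_status = 1
GROUP BY w.id, w.updated_at
HAVING latest_crawl IS NULL OR latest_crawl < w.updated_at;
```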
Check Logs
- Laravel Application Logs: Check storage/logs/laravel.log for command execution details
- Job Queue Logs: Monitor failed jobs with php artisan queue:failed
- Database Query Logs: Enable query logging to debug slow or failing database operations
- Slack Message History: Review the notification channel for error patterns and success rates
Common Issues and Solutions
- Database Deadlocks:
  - Symptom: Retry notifications in Slack
  - Solution: The system retries automatically with exponential backoff (up to 5 attempts); see the sketch after this item
  - Prevention: Consider reducing the batch size with the --limit parameter
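A minimal sketch of the retry behavior described above; the attempt count matches the diagrams, but the sleep times and error detection are assumptions:

```php
use Illuminate\Database\QueryException;
use Illuminate\Support\Facades\DB;
use Illuminate\Support\Facades\Log;

function updateWithRetry(callable $update, int $maxAttempts = 5): void
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        try {
            DB::transaction($update);   // atomic update with rollback on failure
            return;
        } catch (QueryException $e) {   // e.g. deadlock or lost connection
            if ($attempt === $maxAttempts) {
                Log::error('Update failed after max retries', ['exception' => $e]);
                throw $e;               // final failure: logged and reported to Slack
            }
            $sleep = 2 ** $attempt;     // exponential backoff: 2, 4, 8, 16 seconds
            Log::warning("Retry attempt {$attempt}, sleeping {$sleep}s");
            sleep($sleep);
        }
    }
}
```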
- No Matching Records:
  - Symptom: No status updates despite records in Crawling status
  - Solution: Verify that the Analyzer database contains recent data with proper timestamps
  - Check: Ensure the crawler and analyzer systems are functioning properly
- Time Constraint Issues:
  - Symptom: Records remain in Crawling status despite data existing in the Analyzer database
  - Solution: Check the timestamp alignment between updated_at in Console and crawl_created_at in Analyzer
  - Fix: Verify that system clocks are synchronized across database servers
- High Processing Load:
  - Symptom: Command timeouts or slow execution
  - Solution: Reduce the batch size using --limit=50 or lower
  - Monitor: Check database performance and queue worker capacity