Crawler Integration - Update Configurations
Command Signatures
php artisan plg-api:sending-configs-to-crawler --mode=update --data-type=SummaryProduct [--limit=100]
php artisan plg-api:sending-configs-to-crawler --mode=update --data-type=SummaryProductReview [--limit=100]
php artisan plg-api:sending-configs-to-crawler --mode=update --data-type=SummaryCategory [--limit=100]
php artisan plg-api:sending-configs-to-crawler --mode=update --data-type=SummarySearchQuery [--limit=100]
Purpose
These commands update existing crawl configurations in the Crawler system for different data types. They identify records in the summary wishlist tables that have been modified and require updated crawler configurations, then send the modified configuration data to the Crawler system via the Playground API.
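The flow described above can be sketched as a thin Artisan command that validates its options, chunks the modified records, and dispatches a job per chunk. This is an illustrative sketch only: every class and method name other than the command signature (RepositoryFactory, SummaryUpdateJob, chunkDataSendToUpdate) is an assumption based on the sequence diagram below, not the actual implementation.

```php
<?php

// Sketch of the command flow; class/method names are assumptions.
class SendingConfigsToCrawlerCommand extends \Illuminate\Console\Command
{
    protected $signature = 'plg-api:sending-configs-to-crawler
                            {--mode=} {--data-type=} {--limit=100}';

    public function handle(): int
    {
        $mode = $this->option('mode');
        $dataType = $this->option('data-type');
        $limit = (int) $this->option('limit');

        // Validate mode and data type before touching the database.
        if ($mode !== 'update' || ! in_array($dataType, [
            'SummaryProduct', 'SummaryProductReview',
            'SummaryCategory', 'SummarySearchQuery',
        ], true)) {
            $this->error('Invalid --mode or --data-type');
            return self::FAILURE;
        }

        // Hypothetical resolver mapping the data type to its repository.
        $repository = RepositoryFactory::forDataType($dataType);

        // Chunk modified records and dispatch one job per chunk.
        $repository->chunkDataSendToUpdate($limit, function ($records) use ($dataType) {
            SummaryUpdateJob::dispatch($records, $dataType);
        });

        return self::SUCCESS;
    }
}
```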
Sequence Diagram
sequenceDiagram
    participant System
    participant Command as plg-api:sending-configs-to-crawler (update)
    participant Repository as SummaryWishlist*Repository
    participant Job as Summary*Job
    participant APIService as PlaygroundApiService
    participant Crawler as Crawler System
    participant Logger
    participant Slack

    Note over System,Slack: Crawler Configuration Update Flow (Every 5 Minutes)

    rect rgb(200, 255, 200)
        Note right of System: Happy Case - Normal Processing
        System->>Command: Execute with specific data type
        Command->>Logger: Log command start
        Command->>Command: Validate mode and data type parameters
        Command->>Repository: chunkDataSendToUpdate()
        Repository->>Repository: Query records WHERE sending_status = Sent AND needs update
        Repository-->>Command: Return chunk of records (max: limit option)

        rect rgb(200, 230, 255)
            alt Records Found
                Note right of Command: Job Processing
                Command->>Job: dispatch(records)
                Job->>Job: mapRecordsToData()
                Job->>APIService: bulkUpdate(mapped data)
                APIService->>Crawler: HTTP PUT /bulk-update
                Crawler-->>APIService: Response with updated configs
                APIService-->>Job: Return API response

                rect rgb(230, 200, 255)
                    alt Success Response (200 OK)
                        Note right of Job: Success Processing
                        Job->>Repository: Update sending_status to Sent
                        Job->>Repository: Update crawl_config_id if changed
                        Job->>Repository: Update crawl_status if needed
                        Job->>Logger: Log success with statistics
                        Job->>Slack: Send success notification
                    else Bad Request (400)
                        Note right of Job: Error Processing
                        Job->>Repository: Update sending_status to Error
                        Job->>Logger: Log error details
                        Job->>Slack: Send error notification
                    end
                end
            else No Records
                Note right of Command: No Data Scenario
                Command->>Logger: Log no records to process
            end
        end
    end

    rect rgb(255, 200, 200)
        Note right of System: Error Handling
        rect rgb(255, 230, 230)
            alt Unexpected Error Occurs
                Command->>Logger: Log error details
                Command->>Slack: Send error notification with context
            end
        end
    end
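Under the assumptions in the diagram, chunkDataSendToUpdate() selects records that were already sent but whose data has changed since the last push. A minimal Eloquent sketch follows; the SendingStatus constant, the sent_at column, and the updated_at-based change detection are all hypothetical (only sending_status itself appears in the schema below), so the real criteria may differ.

```php
<?php

// Hypothetical sketch of the repository chunking method.
class SummaryWishlistProductRepository
{
    public function chunkDataSendToUpdate(int $limit, callable $callback): void
    {
        SummaryWishlistProduct::query()
            ->where('sending_status', SendingStatus::SENT)
            ->whereNotNull('crawl_config_id')           // only previously created configs
            ->whereColumn('updated_at', '>', 'sent_at') // assumed "needs update" marker
            ->orderBy('id')
            ->chunkById($limit, $callback);
    }
}
```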
Detail
Parameters
--mode=update: Required parameter specifying that existing configurations should be updated
--data-type: Required parameter specifying the type of data for which to update configurations
- SummaryProduct: Product summary data
- SummaryProductReview: Product review data
- SummaryCategory: Category summary data
- SummarySearchQuery: Search query summary data
--limit=N: Optional parameter to control the chunk size (default: 100)
Frequency
Every 5 minutes for each data type
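In a typical Laravel setup this cadence would be registered in the console kernel. A sketch, assuming the standard scheduler is used (the actual scheduling mechanism is not documented here):

```php
<?php

// app/Console/Kernel.php (sketch): one entry per data type,
// each running every five minutes as described under Frequency.
protected function schedule(\Illuminate\Console\Scheduling\Schedule $schedule): void
{
    foreach ([
        'SummaryProduct', 'SummaryProductReview',
        'SummaryCategory', 'SummarySearchQuery',
    ] as $dataType) {
        $schedule->command("plg-api:sending-configs-to-crawler --mode=update --data-type={$dataType}")
            ->everyFiveMinutes()
            ->withoutOverlapping(); // avoid concurrent runs for the same data type
    }
}
```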
Dependencies
- Summary wishlist tables must contain records with sending_status = Sent that have been modified
- Playground API service must be accessible
- Valid API authentication tokens
- Existing crawl_config_id values in the database
Output
Tables
summary_wishlist_products: Updates sending_status, crawl_config_id, crawl_status
- sending_status: Remains Sent for successful updates, changes to Error for failures
- crawl_config_id: May be updated if Crawler returns new ID
- crawl_status: Updated based on Crawler response
summary_wishlist_product_reviews: Same field updates as products
summary_wishlist_categories: Same field updates as products
summary_wishlist_search_queries: Same field updates as products
Services
- Playground API: Receives bulk update requests with modified crawler configurations
- Crawler System: Updates existing crawl configurations based on summary data changes
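A bulk update request body sent with HTTP PUT to /bulk-update might look like the following. The field names are taken from the table schema below, but the exact payload shape and endpoint contract are assumptions, not the documented Crawler API:

```json
{
  "configs": [
    {
      "crawl_config_id": 4211,
      "input": "4901234567894",
      "input_type": "jan",
      "mall_id": 2,
      "schedule_id": 7,
      "schedule_priority": 1
    }
  ]
}
```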
Database Schema
erDiagram
    summary_wishlist_products {
        bigint id PK
        string input "The input of the product"
        string input_type "The type of the input: jan, asin, rakuten_id"
        bigint mall_id FK "Foreign key to malls table"
        integer schedule_id "The id of the schedule"
        integer schedule_priority "The priority of the schedule"
        integer sending_status "The status of the sending to crawler"
        bigint crawl_config_id "The id of the configs table from Crawler (nullable)"
        integer status "The status of the product"
    }
    summary_wishlist_product_reviews {
        bigint id PK
        bigint summary_wishlist_product_id FK "Foreign key to summary_wishlist_products (unique)"
        integer schedule_id "The id of the schedule"
        integer schedule_priority "The priority of the schedule"
        integer sending_status "The status of the sending to crawler"
        bigint crawl_config_id "The id of the configs table from Crawler (nullable)"
        integer status "The status of the review"
    }
    summary_wishlist_categories {
        bigint id PK
        string category_id "The id of the category in the mall"
        bigint mall_id FK "Foreign key to malls table"
        integer schedule_id "The id of the schedule"
        integer schedule_priority "The priority of the schedule"
        integer sending_status "The status of the sending to crawler"
        bigint crawl_config_id "The id of the configs table from Crawler (nullable)"
        integer status "The status of the category"
    }
    summary_wishlist_search_queries {
        bigint id PK
        bigint mall_id FK "The id of the mall"
        string keyword "The keyword to search"
        integer schedule_id "The id of the schedule"
        integer schedule_priority "The priority of the schedule"
        integer sending_status "The status of the sending to crawler"
        bigint crawl_config_id "The id of the configs table from Crawler (nullable)"
        integer status "The status of the search query"
    }

    %% Relationships
    summary_wishlist_products ||--o{ summary_wishlist_product_reviews : "has reviews"
Error Handling
Log
- Command execution start/end with data type and parameters
- Success/failure of API calls with response codes
- Record counts and batch processing information
- Detailed error messages with file and line information for debugging
Slack
- Success notifications with data type and processing statistics (records processed, configs updated)
- Error notifications with detailed message and source information
- Full error context including API response details and affected record counts
Troubleshooting
Check Data
- Verify summary_wishlist_* tables contain records with sending_status = Sent that have been modified
- Check that records have valid crawl_config_id values from previous create operations
- Ensure schedule_id and schedule_priority values are valid
- Validate that updated_at timestamps indicate recent modifications
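The data checks above can be run directly against the database. In the query below, the numeric value for sending_status = Sent is an assumption (substitute your actual enum value), and the same check applies to the other three summary_wishlist_* tables:

```sql
-- Records that should be picked up by the update command
-- (assumes sending_status 1 = Sent; adjust to your enum).
SELECT id, input, crawl_config_id, sending_status, updated_at
FROM summary_wishlist_products
WHERE sending_status = 1
  AND crawl_config_id IS NOT NULL
ORDER BY updated_at DESC
LIMIT 20;
```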
Check Logs
- Monitor command execution logs for successful starts and completions
- Check API response logs for HTTP status codes and error messages
- Review Slack notifications for success/failure patterns
- Examine job queue logs for processing delays or failures
- Verify database update logs show proper status transitions
- Compare before/after configuration data to confirm updates were applied