Crawler Integration - Create Configurations
Command Signatures
php artisan plg-api:sending-configs-to-crawler --mode=create --data-type=SummaryProduct [--limit=100]
php artisan plg-api:sending-configs-to-crawler --mode=create --data-type=SummaryProductReview [--limit=100]
php artisan plg-api:sending-configs-to-crawler --mode=create --data-type=SummaryCategory [--limit=100]
php artisan plg-api:sending-configs-to-crawler --mode=create --data-type=SummarySearchQuery [--limit=100]
Purpose
These commands identify records in the summary wishlist tables that require new crawler configurations and send the necessary configuration data to the Crawler system via the Playground API. Each data type has its own configuration format and is processed separately to maintain efficient crawling operations.
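The flow above can be sketched as a minimal Laravel command. This is a hypothetical reconstruction based only on the names in the sequence diagram (`chunkDataSendToCreate()`, the `Summary*Job` classes); the real implementation may differ.

```php
<?php
// Hypothetical sketch of the command's handle() flow. Class, method, and
// option names follow the sequence diagram; everything else is assumed.

class SendingConfigsToCrawlerCommand extends \Illuminate\Console\Command
{
    protected $signature = 'plg-api:sending-configs-to-crawler
                            {--mode=} {--data-type=} {--limit=100}';

    public function handle(): int
    {
        \Log::info('plg-api:sending-configs-to-crawler started', $this->options());

        // Resolve the repository matching --data-type (mapping is illustrative).
        $repository = $this->resolveRepository($this->option('data-type'));

        // Pull pending records in chunks and dispatch one job per chunk.
        $repository->chunkDataSendToCreate(
            (int) $this->option('limit'),
            fn ($records) => SummaryProductJob::dispatch($records)
        );

        return self::SUCCESS;
    }
}
```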
Sequence Diagram
sequenceDiagram
participant System
participant Command as plg-api:sending-configs-to-crawler (create)
participant Repository as SummaryWishlist*Repository
participant Job as Summary*Job
participant APIService as PlaygroundApiService
participant Crawler as Crawler System
participant Logger
participant Slack
Note over System,Slack: Crawler Configuration Creation Flow (Every 5 Minutes)
rect rgb(200, 255, 200)
Note right of System: Happy Case - Normal Processing
System->>Command: Execute with specific data type
Command->>Logger: Log command start
Command->>Command: Validate mode and data type parameters
Command->>Repository: chunkDataSendToCreate()
Repository->>Repository: Query records WHERE sending_status = NotSent OR Error
Repository-->>Command: Return chunk of records (max: limit option)
rect rgb(200, 230, 255)
alt Records Found
Note right of Command: Job Processing
Command->>Job: dispatch(records)
Job->>Job: mapRecordsToData()
Job->>APIService: bulkCreate(mapped data)
APIService->>Crawler: HTTP POST /bulk-create
Crawler-->>APIService: Response with created configs
APIService-->>Job: Return API response
rect rgb(230, 200, 255)
alt Success Response (201 Created)
Note right of Job: Success Processing
Job->>Repository: Update sending_status to Sent
Job->>Repository: Update crawl_config_id from response
Job->>Repository: Update crawl_status to NotCrawled
Job->>Logger: Log success with statistics
Job->>Slack: Send success notification
else Bad Request (400)
Note right of Job: Error Processing
Job->>Repository: Update sending_status to Error
Job->>Logger: Log error details
Job->>Slack: Send error notification
end
end
else No Records
Note right of Command: No Data Scenario
Command->>Logger: Log no records to process
end
end
end
rect rgb(255, 200, 200)
Note right of System: Error Handling
rect rgb(255, 230, 230)
alt Unexpected Error Occurs
Command->>Logger: Log error details
Command->>Slack: Send error notification with context
end
end
end
Details
Parameters
- --mode=create: Required parameter specifying that new configurations should be created
- --data-type: Required parameter specifying the type of data for which to create configurations
  - SummaryProduct: Product summary data
  - SummaryProductReview: Product review data
  - SummaryCategory: Category summary data
  - SummarySearchQuery: Search query summary data
- --limit=N: Optional parameter to control the chunk size (default: 100)
Frequency
Every 5 minutes for each data type
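The 5-minute cadence could be registered through Laravel's task scheduler as below; whether scheduling actually lives in `Kernel::schedule()` or in system cron is an assumption.

```php
// Hypothetical scheduler registration for all four data types.
protected function schedule(\Illuminate\Console\Scheduling\Schedule $schedule): void
{
    $types = ['SummaryProduct', 'SummaryProductReview', 'SummaryCategory', 'SummarySearchQuery'];

    foreach ($types as $type) {
        $schedule->command("plg-api:sending-configs-to-crawler --mode=create --data-type={$type}")
                 ->everyFiveMinutes()
                 ->withoutOverlapping(); // avoid concurrent runs per data type
    }
}
```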
Dependencies
- Summary wishlist tables must contain records with sending_status = NotSent or Error
- Playground API service must be accessible
- Valid API authentication tokens
Output
Tables
summary_wishlist_products: Updates sending_status, crawl_config_id, crawl_status
- sending_status: Changes from NotSent/Error to Sent
- crawl_config_id: Populated with ID from Crawler response
- crawl_status: Set to NotCrawled for successful creations
summary_wishlist_product_reviews: Same field updates as products
summary_wishlist_categories: Same field updates as products
summary_wishlist_search_queries: Same field updates as products
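The success-path update could look like the sketch below. Field names come from this section; the shape of the Crawler response (an array of created configs, index-aligned with the sent records, each carrying an `id`) is an assumption.

```php
// Illustrative success handling inside the job after a 201 response.
foreach ($response['configs'] as $index => $config) {
    $records[$index]->update([
        'sending_status'  => SendingStatus::Sent,
        'crawl_config_id' => $config['id'],       // ID returned by Crawler
        'crawl_status'    => CrawlStatus::NotCrawled,
    ]);
}
```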
Services
- Playground API: Receives bulk create requests with crawler configurations
- Crawler System: Creates new crawl configurations based on summary data
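One bulk-create request body might look like the following. Only `input`, `input_type`, `mall_id`, `schedule_id`, and `schedule_priority` are documented in the schema below; the wrapper keys and the endpoint path are assumptions.

```php
// Hypothetical request body for PlaygroundApiService::bulkCreate(),
// sent as HTTP POST /bulk-create.
$payload = [
    'data_type' => 'SummaryProduct',
    'items' => [
        [
            'input'             => '4901234567894', // e.g. a JAN code
            'input_type'        => 'jan',           // jan, asin, or rakuten_id
            'mall_id'           => 1,
            'schedule_id'       => 10,
            'schedule_priority' => 1,
        ],
    ],
];
```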
Database Schema
erDiagram
summary_wishlist_products {
bigint id PK
string input "The input of the product"
string input_type "The type of the input: jan, asin, rakuten_id"
bigint mall_id FK "Foreign key to malls table"
integer schedule_id "The id of the schedule"
integer schedule_priority "The priority of the schedule"
integer sending_status "The status of the sending to crawler"
bigint crawl_config_id "The id of the configs table from Crawler (nullable)"
integer status "The status of the product"
}
summary_wishlist_product_reviews {
bigint id PK
bigint summary_wishlist_product_id FK "Foreign key to summary_wishlist_products (unique)"
integer schedule_id "The id of the schedule"
integer schedule_priority "The priority of the schedule"
integer sending_status "The status of the sending to crawler"
bigint crawl_config_id "The id of the configs table from Crawler (nullable)"
integer status "The status of the product review"
}
summary_wishlist_categories {
bigint id PK
string category_id "The id of the category in the mall"
bigint mall_id FK "Foreign key to malls table"
integer schedule_id "The id of the schedule"
integer schedule_priority "The priority of the schedule"
integer sending_status "The status of the sending to crawler"
bigint crawl_config_id "The id of the configs table from Crawler (nullable)"
integer status "The status of the category"
}
summary_wishlist_search_queries {
bigint id PK
bigint mall_id FK "The id of the mall"
string keyword "The keyword to search"
integer schedule_id "The id of the schedule"
integer schedule_priority "The priority of the schedule"
integer sending_status "The status of the sending to crawler"
bigint crawl_config_id "The id of the configs table from Crawler (nullable)"
integer status "The status of the search query"
}
%% Relationships
summary_wishlist_products ||--o| summary_wishlist_product_reviews : "has review"
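The repository's chunking query implied by the diagram (`sending_status = NotSent OR Error`, chunked by the `--limit` option) could be implemented as below; the `chunkDataSendToCreate` name is from the sequence diagram, the Eloquent details are assumptions.

```php
// Hypothetical Eloquent implementation for the products repository; the
// other three repositories would mirror it on their own tables.
public function chunkDataSendToCreate(int $limit, callable $callback): void
{
    SummaryWishlistProduct::query()
        ->whereIn('sending_status', [SendingStatus::NotSent, SendingStatus::Error])
        ->chunkById($limit, $callback); // stable pagination by primary key
}
```

`chunkById` is preferable to `chunk` here because the callback mutates `sending_status`, which would otherwise shift offset-based pages.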
Error Handling
Log
- Command execution start/end with data type and parameters
- Success/failure of API calls with response codes
- Record counts and batch processing information
- Detailed error messages with file and line information for debugging
Slack
- Success notifications with data type and processing statistics (records processed, configs created)
- Error notifications with detailed message and source information
- Full error context including API response details and affected record counts
Troubleshooting
Check Data
- Verify summary_wishlist_* tables contain records with sending_status = NotSent or Error
- Check that records have valid schedule_id and schedule_priority values
- Ensure mall_id references exist in the malls table
- Validate input_type values for products (jan, asin, rakuten_id)
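The checks above can be run ad hoc from `php artisan tinker`; table and column names follow the schema in this document, while the enum constants are assumptions.

```php
// How many records are still waiting to be sent (or failed previously)?
DB::table('summary_wishlist_products')
    ->whereIn('sending_status', [SendingStatus::NotSent, SendingStatus::Error])
    ->count();

// Records with a missing mall reference or an unexpected input_type.
DB::table('summary_wishlist_products')
    ->whereNull('mall_id')
    ->orWhereNotIn('input_type', ['jan', 'asin', 'rakuten_id'])
    ->count();
```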
Check Logs
- Monitor command execution logs for successful starts and completions
- Check API response logs for HTTP status codes and error messages
- Review Slack notifications for success/failure patterns
- Examine job queue logs for processing delays or failures
- Verify database update logs show proper status transitions