Automated extraction of wedding vendor data from Hitched.co.uk with AI-powered name extraction. This project collects vendor titles, descriptions, and primary contact names, using a large language model to pull the contact names out of free-text vendor descriptions.
Key challenges: ensuring accuracy in AI-powered name extraction from unstructured text; handling large datasets efficiently using multi-threaded scraping; and integrating AI output with traditional web-scraped data formats (CSV, JSONL).
Leveraged **Azure OpenAI via LangChain** to extract names from vendor descriptions, integrated multi-threading with **ThreadPoolExecutor** to speed up data scraping, and used **Pydantic** to ensure structured data output in both **JSONL** and **CSV** formats.
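The pipeline described above can be sketched roughly as follows. This is a minimal, offline illustration rather than the project's actual code: a standard-library dataclass stands in for the Pydantic model, and `extract_contact_name` is a hypothetical placeholder for the Azure OpenAI call made through LangChain. The `VendorRecord` field names, the helper names, and the sample vendors are all assumptions made for the example.

```python
import csv
import io
import json
from concurrent.futures import ThreadPoolExecutor
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class VendorRecord:
    # Stand-in for the project's Pydantic model; field names are assumed.
    title: str
    description: str
    contact_name: Optional[str] = None


def extract_contact_name(description: str) -> Optional[str]:
    """Hypothetical placeholder for the LLM-based name extraction.

    A naive rule (the two words after "run by") keeps the sketch runnable
    without network access; the real project delegates this step to a model.
    """
    marker = "run by "
    idx = description.lower().find(marker)
    if idx == -1:
        return None
    tail = description[idx + len(marker):]
    words = tail.replace(",", " ").replace(".", " ").split()
    return " ".join(words[:2]) if words else None


def enrich(raw: dict) -> VendorRecord:
    # Combine the scraped fields with the extracted contact name.
    return VendorRecord(
        title=raw["title"],
        description=raw["description"],
        contact_name=extract_contact_name(raw["description"]),
    )


def enrich_all(raw_vendors: list, max_workers: int = 4) -> list:
    # Executor.map runs enrich concurrently and returns results in input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(enrich, raw_vendors))


def to_jsonl(records: list) -> str:
    # One JSON object per line.
    return "\n".join(json.dumps(asdict(r)) for r in records)


def to_csv(records: list) -> str:
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["title", "description", "contact_name"])
    writer.writeheader()
    for r in records:
        writer.writerow(asdict(r))  # a missing name (None) becomes an empty cell
    return buf.getvalue()


# Made-up sample data, for illustration only.
SAMPLE = [
    {"title": "Rose Petal Florists", "description": "Family florist run by Jane Smith since 2009."},
    {"title": "Oak Barn Venue", "description": "A rustic barn venue in the Cotswolds."},
]

if __name__ == "__main__":
    records = enrich_all(SAMPLE)
    print(to_jsonl(records))
    print(to_csv(records))
```

Threads suit this workload because both page fetches and model calls spend most of their time waiting on the network. Keeping the extraction step behind a single function also means the placeholder rule can be swapped for the real model call without touching the threading or serialization code.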
The project efficiently collected and augmented vendor data from Hitched.co.uk, enabling automated generation of enriched vendor profiles ready for further analysis or integration into marketing systems.