Download resume dataset extracted from Indeed.com
US resume dataset on DataStock with ~8M records

Here’s a pre-crawled sample dataset from Indeed.com, a popular technology-driven job board based in the United States. This dataset contains details of resumes from US-based candidates.

Possible uses of this dataset, and insights that can be drawn from analyzing it, are listed below:

  • Training a machine learning model for a job recommendation engine based on candidate profiles.
  • Identifying skill sets that are highly available in a given location, based on zip code.
  • Predicting possible professional alternatives based on skills and user profiles.
  • Maintaining a resume repository and identifying the education level and work experience of job aspirants.

The Indeed.com dataset contains the following fields:

  • pageurl: Source URL from which the resume data was extracted.
  • uniq_id: A unique identifier assigned to every entry in the dataset.
  • url: Resume or profile URL.
  • zipcode: Zip code of the location of the profile.
  • name: Designation or job profile attached to the resume.
  • raw_html: Complete resume data, including HTML tags.
  • pdf_download_link: Download link for the resume.
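
As a rough illustration of how these fields might be used for the zip-code skill analysis mentioned above, here is a minimal Python sketch. It assumes the dataset is delivered as a CSV file whose column names match the field names listed here; the actual delivery format is not stated on this page. The sample rows, the skill keywords, and the helper names `html_to_text` and `skills_by_zipcode` are made up for the example, and the keyword check is a naive substring match rather than a real skill extractor.

```python
import csv
import io
from collections import Counter, defaultdict
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML fragment, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(raw_html):
    """Return the text content of raw_html with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(raw_html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def skills_by_zipcode(csv_file, skills):
    """Count, per zipcode, how many resumes mention each skill keyword."""
    counts = defaultdict(Counter)
    for row in csv.DictReader(csv_file):
        text = html_to_text(row["raw_html"]).lower()
        for skill in skills:
            # Naive substring match; assumed good enough for a sketch.
            if skill.lower() in text:
                counts[row["zipcode"]][skill] += 1
    return counts


# Tiny made-up sample standing in for the real export (assumed CSV layout).
SAMPLE = io.StringIO(
    "uniq_id,zipcode,name,raw_html\n"
    '1,10001,Data Analyst,"<div><p>Skills: Python, SQL</p></div>"\n'
    '2,10001,Web Developer,"<ul><li>JavaScript</li><li>SQL</li></ul>"\n'
    '3,94105,Data Engineer,"<p>Python and Spark</p>"\n'
)

result = skills_by_zipcode(SAMPLE, ["Python", "SQL", "JavaScript"])
for zipcode, counter in sorted(result.items()):
    print(zipcode, dict(counter))
```

For the full dataset, the same function could be pointed at an opened file instead of the in-memory sample. With millions of rows holding complete HTML, reading row by row as done here keeps memory use low, and Python's `csv.field_size_limit` may need raising if individual `raw_html` values are very large.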