How Resume Parsing Differs from Data Extraction (Keyword Searching), or There’s More to the Resume than Meets the Eye (If It’s Been Parsed)

At some point, the volume of resumes an organization receives exceeds its ability to process them manually. The cost of cutting and pasting information by hand from a resume into an applicant tracking system (ATS) far exceeds the cost of either a) storing the resume in a consistent format so that it is, hopefully, ‘searchable’, or b) having the resume parsed by a parsing tool or service, which creates tagged data.

Given the expense of installing and maintaining an ATS, training staff to use it, and acquiring the resumes in the first place, it only makes sense to make the best use of the ‘big data’ available in those resumes. So how does making the resume searchable via data extraction compare to having the resume parsed?

Remember, because parsers are probabilistic tools, perfection remains an elusive goal. Resumes are highly unstructured documents, with ever-changing skill sets (competencies), job titles, companies, etc., so no piece of software could work ‘perfectly’. Then again, a human cutting and pasting a resume into an ATS would also miss some pieces of relevant information. And ‘searching’, even with a sophisticated knowledge of Boolean string construction, will not necessarily give you the results you seek when combing the database (ATS).

For example, if your ATS is only keyword-searchable and you search for someone with six or more years of Java experience, you may get everyone in your ATS whose resume mentions Java. An ATS with resume parsing implemented, however, would be able to return only those Java candidates with six or more years of experience. Why? A proper resume parser derives information on when a skill was last used and for how many years it was used. To do so, the parser walks through all job descriptions; wherever the skill is mentioned, the start and end dates of that position are used to calculate the total number of years of experience the person has with that skill. The most recent end date of a position where the skill is mentioned becomes ‘Year Last Used’ for that skill.
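The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not any particular parser's implementation: the `Position` class, the `skill_experience` function, and the sample work history are all hypothetical, and the skill is detected with a naive substring match (which would also match "JavaScript" when searching for "Java").

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Position:
    """One job entry from a parsed resume (hypothetical structure)."""
    start: date
    end: date
    description: str


def skill_experience(positions, skill):
    """Return (total_years, year_last_used) for a skill.

    Sums the duration of every position whose description mentions the
    skill. Overlapping positions are counted twice in this simple sketch.
    """
    matching = [p for p in positions if skill.lower() in p.description.lower()]
    if not matching:
        return 0.0, None
    total_days = sum((p.end - p.start).days for p in matching)
    year_last_used = max(p.end for p in matching).year
    return round(total_days / 365.25, 1), year_last_used


# Made-up work history for illustration only.
positions = [
    Position(date(2010, 1, 1), date(2014, 1, 1), "Built Java services for billing."),
    Position(date(2014, 1, 1), date(2016, 1, 1), "Managed a help desk team."),
    Position(date(2016, 1, 1), date(2019, 1, 1), "Led Java and SQL development."),
]

years, last_used = skill_experience(positions, "Java")
meets_requirement = years >= 6  # the "six or more years" filter
```

With tagged values like these stored in the ATS, the "six or more years" condition becomes a simple numeric filter rather than a keyword match.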

Similarly, a well-trained parser will ‘term-validate’. Term Validation associates various permutations of skills and job titles with a standardized, “valid” term from the parser’s taxonomy of skills and replaces “raw” terms with “valid” skills and job titles. If a resume mentions “Windows”, it is standardized into “Microsoft Windows”; a job title of “Soft. Eng.” or “S/E” (yes, people do this) will become “Software Engineer”. You wouldn’t want to miss a candidate because they wrote ‘S/E’, would you?
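As a rough sketch, term validation can be thought of as a lookup from normalized raw terms to standardized ones. The `VALID_TERMS` table and the `validate_term` function below are hypothetical and hold only the examples mentioned above; a real taxonomy would be far larger and would likely use fuzzier matching.

```python
# Hypothetical mapping of raw terms (lowercased) to "valid" taxonomy terms.
VALID_TERMS = {
    "windows": "Microsoft Windows",
    "soft. eng.": "Software Engineer",
    "s/e": "Software Engineer",
}


def validate_term(raw):
    """Return the standardized term for a raw term, or the raw term unchanged."""
    # Lowercase and collapse whitespace so "S/E" and " s/e " look the same.
    key = " ".join(raw.lower().split())
    return VALID_TERMS.get(key, raw)


title = validate_term("S/E")
skill = validate_term("Windows")
```

Here both "S/E" and "Soft. Eng." resolve to "Software Engineer", so a search on the standardized title finds either candidate.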

Another case where you’re looking for more than what’s printed (or searchable) on the page is hierarchies. Hierarchies identify skills that are not specifically listed in a resume. For example, AIX, Solaris, HP-UX, Linux, Ultrix, and others are children of the parent UNIX. With the hierarchies feature enabled, the parser would return “UNIX” as a skill when one of the children of UNIX was identified (for example, AIX). Skills and job titles can have multiple parents and multiple levels of parents. For example, the skill ABAP has the parent SAP, which in turn has the parent ERP. So all three skills (ABAP, SAP, ERP) will be returned in the XML string when the hierarchies feature is enabled.
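One way to picture this is as a walk up a parent table. The sketch below is an assumption about how such a feature could work, not a description of any vendor's code: the `PARENTS` table contains only the examples given above, and `expand_skills` is a hypothetical helper that adds every ancestor of each skill found in the resume.

```python
# Hypothetical parent table; each skill may list several parents.
PARENTS = {
    "AIX": ["UNIX"],
    "Solaris": ["UNIX"],
    "HP-UX": ["UNIX"],
    "Linux": ["UNIX"],
    "Ultrix": ["UNIX"],
    "ABAP": ["SAP"],
    "SAP": ["ERP"],
}


def expand_skills(found):
    """Return the found skills plus all of their ancestors, at every level."""
    expanded = set()
    stack = list(found)
    while stack:
        skill = stack.pop()
        if skill in expanded:
            continue  # already handled; also guards against cycles
        expanded.add(skill)
        stack.extend(PARENTS.get(skill, []))
    return expanded


sap_skills = expand_skills(["ABAP"])
unix_skills = expand_skills(["AIX"])
```

A resume that mentions only ABAP would then also surface in searches for SAP or ERP, and one that mentions only AIX would surface in a search for UNIX.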

As you can see, with tagged data from a parser, your search can be defined more intelligently and, consequently, the results (in terms of resumes returned by the ATS) are much narrower. Instead of receiving 30 resumes, you get three. Once again, the time saved is the true value proposition of resume parsing. Whether it’s on the data entry side or the search results side, resume parsing is distinctly different from data extraction and has an ROI far in excess of its cost. Does your ATS parse resumes?


