Péter Jacsó, writing in Library Journal, does for the metadata in Google Scholar what Geoffrey Nunberg did for Google Books:
False names are created from options on the search menu, such as P Options (for Payment Options); from parts of the author affiliation (CA San Diego, C Ltd, M View for Mountain View); from Table of Contents pages on publishers' web sites; and from section headings of articles (B Methods, D Definitions, G Assessment, H Variables, I Evaluation). (The initial varies depending on the section-identifying letter or Roman numeral.)
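To make the failure mode concrete, here is a deliberately naive sketch, purely my own illustration since Google has not published its parser, of the kind of "initial plus surname" heuristic that would manufacture exactly these ghost authors when run over raw page text rather than over publisher-supplied metadata:

    import re

    # Hypothetical illustration only; Google Scholar's actual parser is not
    # public. This naive heuristic treats any pair of capitalized words as
    # "Forename Surname" and abbreviates the first to an initial, the way
    # citations render author names. Fed menu options, affiliations, and
    # section headings, it produces the false authors Jacsó lists.
    NAME_PAIR = re.compile(r'\b([A-Z][a-z]*)\.?\s+([A-Z][a-z]+)\b')

    scraped = "Payment Options | Mountain View | B. Methods | D. Definitions"

    for first, last in NAME_PAIR.findall(scraped):
        print(f"{first[0]} {last}")
    # prints: P Options, M View, B Methods, D Definitions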
The article is scathing about the quality of Google's parsers, and argues that Google should rely more on the high-quality metadata now being made available by journal publishers.
The press and the public were so enamored of anything with the word Google in it that GS developers apparently believed they could create a parser to identify the metadata better than the human indexers at the publishers, repositories, and indexing/abstracting services who assigned metadata by listing author, title, journal name, publication year, and other metadata elements.
But note that this is the opposite of the problem with Google Books, where Jon Orwant's response to Nunberg put much of the blame on bad metadata supplied to Google by outside cataloguers. My instant reaction is that the situation can't be as black-and-white as Jacsó claims; I'd like to know more about what data sources were made available to Google, on what terms, and with what history.