An analysis of programming language statement frequency in C, C++, and Java source code

  • Xiaoyan Zhu
  • E. James Whitehead
  • Caitlin Sadowski
  • Qinbao Song
  • University of California at Santa Cruz
  • Xi'an Jiaotong University

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

Statement frequency data can inform programming language research and provide a solid basis for frequency-based code analysis. This paper presents an analysis of programming language statement frequency in a large corpus of C, C++, and Java source code comprising more than 54 million lines of code. Across these languages, the four most frequent work-performing statement types are Method/Function Call, Assignment, If, and Return. Compared with studies of FORTRAN, COBOL, and PL/I from the 1970s, the main change is the prevalence of method/function calls. Statement use frequency is remarkably similar across languages, and within each individual language, most statement types have a frequency distribution that falls within a narrow range. A more detailed examination of assignment and looping statement types shows that many assignments simply copy data, and that C++ and Java use 'for' statements more often than C does.

Original language: English
Pages (from-to): 1479-1495
Number of pages: 17
Journal: Software - Practice and Experience
Volume: 45
Issue number: 11
DOIs
State: Published - Nov 2015

Keywords

  • metrics
  • source code
  • statement frequency
