Analyzing Linux on a Supercomputer

Bibliographic Details
Published in IEEE Software Vol. 42; no. 2; pp. 18-23
Main Author Spinellis, Diomidis
Format Journal Article
Language English
Published Los Alamitos IEEE 01.03.2025
IEEE Computer Society
Summary: The C preprocessor, a key element of the language, has become a liability due to its lack of integration with modern language semantics. This column describes the analysis of C preprocessor usage in the Linux kernel, comprising 20 million lines of code, using the CScout refactoring browser. Processing limitations led to a solution leveraging a supercomputer’s parallel processing capabilities. The analysis divided the kernel’s source files across 32 supercomputer nodes and implemented a binary tournament database merging strategy. Initial efforts revealed multiple difficulties. Resolving them involved several false starts with recursive SQL statements, an SQLite extension, and the GraphViz connected components tool. After a number of redesigns guided by stress-testing, the analysis finished in just 32 hours rather than a week, using 374 CPU hours and 640 GiB RAM on the supercomputer’s nodes.
ISSN: 0740-7459, 1937-4194
DOI: 10.1109/MS.2024.3512732