In experimental studies on query languages, subjects are required to write
queries using different query languages. User query performance is usually
measured by query accuracy. There is no clearly defined objective method of
applying findings to other queries. This study examines the suitability of
using a software metric based on lines of code to estimate user query
accuracy. Lines of code have been measured in various ways, such as physical
source code lines, logical source code lines or compiled bytes. A method of
counting lines of code for database queries is proposed and applied to two
query languages. The new method counts Boolean conditions as well as other
statements. The relationship between lines of code and user query accuracy
was examined with regression models. The results show that lines of code can
explain a high percentage of the variance in accuracy, with R² > 0.8 for
the standard relational model query language SQL, and R² > 0.9 for the
entity relationship model query language KQL. The common assumption that more
lines of code will lead to lower accuracy is only partly validated. The
findings show a nonlinear relationship, with a possible recovery in accuracy
for queries with many lines of code. The results indicate that lines of code
can be usefully applied in the study of query languages. © 1999 Academic
Press.
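
The regression step summarized above can be illustrated with a minimal sketch. The data points below are invented purely for illustration and are not the study's measurements. The quadratic term is one assumed way to model a nonlinear relationship with a possible recovery in accuracy, and is not necessarily the model the authors fitted. The names `fit_poly` and `r_squared` are hypothetical helpers.

```python
import numpy as np


def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def fit_poly(x, y, degree):
    """Least-squares polynomial fit; returns (coefficients, R^2).

    Coefficients are ordered from the highest power to the constant term,
    following the convention of numpy.polyfit.
    """
    coeffs = np.polyfit(x, y, degree)
    return coeffs, r_squared(y, np.polyval(coeffs, x))


# Hypothetical data (NOT from the study): lines of code per query and the
# fraction of subjects who wrote that query correctly.
lines_of_code = np.array([2, 4, 6, 8, 10, 12, 14], dtype=float)
accuracy = np.array([0.95, 0.80, 0.65, 0.55, 0.50, 0.52, 0.60])

# A linear model tests the common assumption "more lines, lower accuracy".
lin_coeffs, lin_r2 = fit_poly(lines_of_code, accuracy, degree=1)

# A quadratic model can capture a recovery in accuracy for long queries.
quad_coeffs, quad_r2 = fit_poly(lines_of_code, accuracy, degree=2)

print("linear slope:", lin_coeffs[0], "R^2:", lin_r2)
print("quadratic leading coefficient:", quad_coeffs[0], "R^2:", quad_r2)
```

On data shaped like this, a negative linear slope combined with a positive quadratic coefficient would be consistent with accuracy falling at first and partly recovering for queries with many lines of code.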