Computing Reviews
Data-driven prototyping via natural-language-based GUI retrieval
Kolthoff K., Bartelt C., Ponzetto S. Automated Software Engineering 30(1), 2023. Type: Article
Date Reviewed: Mar 7 2024

Reading this paper would leave anyone conflicted. It describes a good contribution riddled with flaws. It is puzzling that the referees and journal editors accepted the paper in its current form. The presentation style is verbose, and some of the long sentences read like word-for-word translations from another language. The technical material is described informally, and one needs to consult the supplementary material (various documents including notebooks, data, code, and evaluation) in order to separate the research contributions from several claims. A recurring claim, and the main appeal of this research, is “natural language requirements” elicitation. This claim is questionable on two counts. First, in software engineering, users’ needs become requirements only after translation by requirements engineers. Second, browsing through the “gold standard” file (another critical claim) in the supplementary material reveals the bare simplicity of the “requirements”: each entry in the file is a phrase or a short sentence.

The authors describe the stages they undertook to implement a graphical user interface (GUI) prototyping system, their evaluation of the system, the retrieval methods they used, the gold standard they developed, and user experience with the system. The first stage is the construction of a curated GUI repository from the Rico GUI dataset (note that the provided link is broken). The curation process uses filtering criteria that include limiting application types to business and utilities, selecting only English-language GUIs, and removing erroneous cases. As part of this process, a textual representation of each GUI was built by extracting text from metadata, XPath expressions, and icon labels. Although the authors claim that the repository contains only English-language applications, some of the results of my queries in the prototype application were in other languages (Spanish, Vietnamese).

The second stage is the GUI retrieval implementation. A user requirement in natural language is processed using the typical natural language processing (NLP) pipeline to generate a query. Given the queries in the gold standard, the NLP involved is trivial. Three different information retrieval methods (TF-IDF, BM25, and nBOW) are used to retrieve ranked GUIs satisfying the query. These results are used to perform automatic query expansion based on the Kullback-Leibler divergence model. Moreover, a BERT-based learning to rank (LTR) model is adopted to rank the retrieved GUIs. The authors state that they fine-tuned an existing BERT model without providing any further details.
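For readers unfamiliar with the retrieval step, the following is a minimal sketch of BM25 ranking followed by a Kullback-Leibler-style selection of expansion terms. It is not the authors' implementation: the toy corpus `GUI_TEXTS`, the function names, the whitespace tokenizer, the parameter values `k1` and `b`, and the particular IDF and term-scoring formulas are all assumptions chosen for illustration, and the paper's exact variants may differ.

```python
import math
from collections import Counter

# Hypothetical textual representations of four GUIs (not taken from Rico).
GUI_TEXTS = [
    "login screen username password sign in button",
    "settings screen notifications toggle dark mode",
    "sign up form email password confirm password",
    "calculator keypad digits equals button",
]


def tokenize(text):
    """Lowercase whitespace tokenizer; a stand-in for a full NLP pipeline."""
    return text.lower().split()


def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Return (doc_index, score) pairs sorted by descending BM25 score."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for t in tokenized for term in set(t))
    scores = []
    for i, terms in enumerate(tokenized):
        tf = Counter(terms)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Smoothed IDF (assumed variant; always positive).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length-normalized term-frequency saturation.
            norm = tf[term] + k1 * (1 - b + b * len(terms) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append((i, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


def kl_expansion_terms(top_docs, all_docs, query, n_terms=2):
    """Pick non-query terms with the highest p_fb * log(p_fb / p_coll)."""
    fb = Counter(t for d in top_docs for t in tokenize(d))
    coll = Counter(t for d in all_docs for t in tokenize(d))
    fb_total, coll_total = sum(fb.values()), sum(coll.values())
    query_terms = set(tokenize(query))
    scored = {}
    for term, count in fb.items():
        if term in query_terms:
            continue
        p_fb = count / fb_total            # probability in feedback documents
        p_coll = coll[term] / coll_total   # probability in the whole collection
        scored[term] = p_fb * math.log(p_fb / p_coll)
    return sorted(scored, key=scored.get, reverse=True)[:n_terms]


query = "login password"
ranking = bm25_rank(query, GUI_TEXTS)
# Treat the two best-ranked GUIs as pseudo-relevance feedback.
feedback_docs = [GUI_TEXTS[i] for i, _ in ranking[:2]]
expansion = kl_expansion_terms(feedback_docs, GUI_TEXTS, query)
expanded_query = query + " " + " ".join(expansion)
```

In this sketch the expansion assumes `top_docs` is drawn from `all_docs`, so every feedback term has a nonzero collection probability. A learned re-ranker such as the BERT-based LTR model mentioned above would operate on the output of a second retrieval pass with `expanded_query`; that step is not shown here.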

The third stage is the implementation of the prototype system. The main window consists of three views: (1) Search GUIs, (2) Create Prototype, and (3) Explore Preview. Only “Search GUIs” is functional in the present version. The authors do not provide any documentation or user’s guide, and the explanations given in the text are not consistent with the implementation.

The final stage is the experimental evaluation. The following three research questions are posed:

RQ1: Which retrieval method performs best for GUI retrieval on the basis of NL search queries?
RQ2: Does RaWi increase the GUI prototyping productivity compared to a traditional prototyping tool?
RQ3: Do users perceive RaWi as useful for rapid high-fidelity GUI prototyping?

Addressing RQ1 requires a set of high-quality queries. The authors developed what is termed a “gold standard” of queries. The process uses Amazon Mechanical Turk workers to generate queries for a subset of GUIs from the repository. These queries are then filtered in more or less formal ways to obtain the gold standard. RQ2 uses a controlled experiment with 19 subjects to assess the relative productivity of the prototyping system. Based on their background and related experience, these subjects form a representative group. The participants were tasked with building two GUIs, one using the prototyping system and the other using a traditional approach (left undefined). Snapshots of each participant’s progress were taken at regular intervals and subsequently analyzed. A usability study was conducted to assess RQ3. The participants answered several questions using a five-point Likert scale. A statistical analysis was performed for each of the research questions, and the results are summarized in various tables.

Reviewer:  B. Belkhouche Review #: CR147723
Software Development (K.6.3 ... )
General (D.2.0 )
Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®