An Effort Estimation Taxonomy for Agile Software Development

In Agile Software Development (ASD), effort estimation plays an important role during release and iteration planning. The state of the art and practice on effort estimation in ASD have been recently identified. However, this knowledge has not yet been organized. The aim of this study is twofold: (1) to organize the knowledge on effort estimation in ASD and (2) to use this organized knowledge to support practice and future research on effort estimation in ASD. We applied a taxonomy design method to organize the identified knowledge as a taxonomy of effort estimation in ASD. The proposed taxonomy offers a faceted classification scheme to characterize estimation activities of agile projects. Our agile estimation taxonomy consists of four dimensions: estimation context, estimation technique, effort predictors and effort estimate. Each dimension in turn has several facets. We applied the taxonomy to characterize estimation activities of 10 agile projects identified from the literature to assess whether all important estimation-related aspects are reported. The results showed that studies do not report complete information related to estimation. The taxonomy was also used to characterize the estimation activities of four agile teams from three different software companies. The practitioners involved in the investigation found the taxonomy useful in characterizing and documenting the estimation sessions.


Introduction
Software is developed incrementally and iteratively in Agile Software Development (ASD). Like other activities in ASD, planning and estimation are also performed iteratively at various stages, such as during release or sprint planning. Estimation in ASD has been investigated in a number of studies. We have recently aggregated these works in a Systematic Literature Review (SLR) [1]. In order to get a broader understanding of how effort estimation is practiced by agile teams, we decided to carry out a follow-up study to elicit the state of the practice on effort estimation in ASD. The SLR included only peer-reviewed empirical works as primary studies, and did not include practitioners' opinions expressed in white papers, blogs, forums, etc. The follow-up study [2], on the other hand, directly elicited the opinions of agile practitioners on effort estimation in ASD to complement the findings of the SLR. These studies identified and aggregated knowledge on effort estimation in ASD from the literature (SLR) and the industry (survey). The knowledge includes aspects such as the techniques used to estimate effort in ASD, the employed size measures and other cost drivers, and the context in which estimation is performed in ASD.
We believe that this body of knowledge on effort estimation in ASD needs to be organized to facilitate both future research and practice in this area. Taxonomies have been used to organize the body of knowledge in software engineering (SE) [3] and in other disciplines [4] as well, and have played an important role in maturing these fields. The concept of taxonomy was originally used by the Swedish scientist Carl Linnaeus in the eighteenth century, when he used a hierarchical structure to classify nature (animals, plants and minerals) based on common physical characteristics [5]. A taxonomy is mainly a classification mechanism, and is defined as follows in two well-known dictionaries:
• The Cambridge dictionary defines taxonomy as "a system for naming and organizing things, especially plants and animals, into groups that share similar qualities."
• The Oxford dictionaries define taxonomy as "The classification of something, especially organisms" or "A scheme of classification."
Classifications offer both theoretical and practical advantages. On the one hand, they serve as a tool to explore the relationships between entities and identify gaps in the existing knowledge, while on the other hand they can support the selection of appropriate entities in a real scenario [6]. In this work we propose a taxonomy to classify and characterize the effort estimation activities of software projects conducted in an ASD context. A few classifications of effort estimation techniques have been proposed in the literature. They are briefly described in Sec. 2. These classifications are about estimation techniques only. Our taxonomy, on the other hand, covers the whole estimation activity, which includes, besides techniques, effort predictors, estimation context and effort estimates. The contributions of the proposed taxonomy are as follows:
• It organizes the body of knowledge on effort estimation in ASD.
• It provides a mechanism to classify and characterize effort estimation in ASD. The taxonomy dimensions can be used by researchers to assess and improve the reporting of research on effort estimation in ASD.
• It supports effort estimation practice in ASD. The taxonomy dimensions can be used to document important estimation-related information, thereby helping to externalize the otherwise tacit knowledge of estimation sessions. The taxonomy can also help agile practitioners during estimation sessions by reminding them of the important aspects that should be considered when effort is being estimated.
The taxonomy was designed by following a taxonomy design method, which is briefly described in Sec. 3. The taxonomy was used to characterize 10 estimation cases reported in the literature. The usefulness of the taxonomy was also demonstrated by classifying and characterizing the estimation activities of five agile teams.
The rest of the paper is organized as follows: Section 2 describes the related work. Section 3 details the research questions (RQs) and methods used to design and evaluate our agile estimation taxonomy. Section 4 presents the results. Section 5 compares the proposed taxonomy with existing work and also discusses the validity threats. Finally, Sec. 6 concludes the paper.

Related Work
In this section we describe the background work that led to the design of the proposed taxonomy, and existing classifications of effort estimation techniques.

State of the art and practice on effort estimation in ASD
We conducted an SLR [1] with the aim of identifying the state of the art on effort estimation in ASD. The SLR was performed by following the guidelines of Kitchenham and Charters [7]. The SLR reported results based on 20 papers covering 25 studies on agile effort estimation. The details relating to the planning, execution and analysis of the SLR results are described in [1]. To complement the findings of the SLR, we carried out a follow-up study [2] to identify the state of the practice on effort estimation in ASD. As the data and results of these two empirical studies form the basis for the agile estimation taxonomy presented in Sec. 4, we briefly present and compare the results of these two empirical studies in this section, covering the context, estimation techniques and effort predictors used to estimate effort in ASD. Both empirical studies identified various contextual factors around which agile teams estimate effort. In this sub-subsection, we present the findings of the two studies on three contextual factors: planning level, development activities and agile methods. The results of the SLR and survey show that estimation is mainly performed at the release and iteration planning levels in ASD. However, many primary studies (48%) in our SLR did not explicitly specify the planning level at which estimation was performed. The survey results showed that some agile teams (16.67%) estimate effort during daily meetings as well. A handful of the agile practitioners in the survey (13.33%) stated that their teams estimate effort at the project bidding level as well. The results are summarized in Table 1.
The SLR and the survey also investigated the scope of the effort estimates with respect to the development activities, i.e. which development activities are accounted for in the effort estimates. Implementation and testing were found to be the main estimated activities in both studies. In 36% of the primary studies of the SLR, all development activities were accounted for in the effort estimates. About 24% of the primary studies did not report this information. The results are listed in Table 2.
Scrum and eXtreme Programming (XP) were identified as the leading agile methods in the SLR studies and the survey responses. It was interesting to note that the primary studies of the SLR each used only a single agile method (Scrum or XP). However, a large number of the survey respondents cited using a combination of methods. Kanban, for example, is used mostly in combination with Scrum. These results are summarized in Table 3. Some other contextual factors were also elicited in the SLR and the survey. These include, among other factors, the team size, the project/product domain and the team setting. In most cases the team size was found to be less than 10. Agile teams in 80% of the SLR's primary studies were co-located, while the remaining 20% of the SLR studies included geographically distributed teams. In the case of the survey, all respondents stated that they are currently part of co-located teams.

Estimation technique
The results of the SLR and the survey showed that estimation techniques that rely on the subjective decisions of practitioners (see Table 4) are used to estimate effort in the ASD context. The most frequently used techniques are expert judgment, planning poker and the use case points (UCP) estimation method. These findings are in line with the agile philosophy wherein people are valued more than processes, and heavyweight models and processes are not preferred. Effort estimation in ASD is mostly a group activity, i.e. the effort is estimated by the team.
The primary studies in the SLR used various measures, such as Mean Magnitude of Relative Error (MMRE), to calculate the accuracy of the estimates. However, according to the survey results, these measures are not very common in the industry. For most practitioners a simple comparison of the estimated and the actual efforts is sufficient. The techniques achieved varying degrees of accuracy in the primary studies of the SLR, while the survey results indicated that there is a tendency to underestimate the effort (42% of the respondents believe that estimates are underestimated by 25% or more).
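To make the accuracy measure concrete, the following is a minimal sketch of MMRE using its commonly cited definition (the mean over all items of |actual − estimated| / actual). The function name and the sample effort values are assumptions made for illustration; they are not data from the SLR or the survey.

```python
def mmre(actuals, estimates):
    """Mean Magnitude of Relative Error over paired actual/estimated efforts.

    For one item, MRE = |actual - estimate| / actual; MMRE is the mean MRE.
    """
    if len(actuals) != len(estimates) or not actuals:
        raise ValueError("need two non-empty sequences of equal length")
    if any(a <= 0 for a in actuals):
        raise ValueError("actual effort must be positive")
    return sum(abs(a - e) / a for a, e in zip(actuals, estimates)) / len(actuals)


# Hypothetical efforts in person-hours for three user stories.
actual_hours = [100, 80, 50]
estimated_hours = [80, 100, 50]
print(mmre(actual_hours, estimated_hours))  # about 0.15
```

A lower value indicates more accurate estimates. The simple comparison that most surveyed practitioners rely on corresponds to inspecting the individual differences between actual and estimated effort rather than aggregating them.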

Effort predictors
The product size and the cost drivers constitute the effort predictors. The product size is believed to be an important effort predictor. Different metrics (e.g. function points) have been used to measure product size. Our survey and the SLR results (see Table 5) showed that story points, function points and use case points are the most frequently used size measures during effort estimation in ASD.
With regard to the cost drivers, the survey and the SLR results identified that team-related cost drivers (see Table 6) play a significant role in effort estimation in ASD. This aligns well with the values of agile development wherein individuals and teams are valued higher than processes and tools. Other cost drivers that are also considered important are nonfunctional requirements (NFRs; e.g. performance, security), project/product domain and customer communication.

2.2. Recent studies on effort estimation in ASD
Section 2.1 presented a summary of our review of the state of the art and practice on effort estimation in ASD till 2014. In this subsection, we discuss some recently reported studies on effort estimation in ASD. Garg and Gupta [8] proposed a new cost estimation model for agile software projects. Using project characteristics as input, the proposed model used Principal Component Analysis (PCA) to identify a reduced set of key characteristics that are relevant for the development cost. A constraint solving approach is then employed to ensure the fulfillment of the agile manifesto criteria. The model is applied on a dataset of agile projects collected from multiple software companies in different countries. The authors noted that the proposed model demonstrated better estimation accuracy in terms of MMRE compared to other approaches such as planning poker.
Continuous changes in user stories (USs) in ASD lead to uncertainty, which is one of the main reasons for inaccurate time and cost estimates [9]. Popli and Chauhan [9] proposed a technique to manage this uncertainty in ASD to improve estimation. In the proposed technique the user stories are first divided into substories, and then for each of the substories three types of story points are calculated: fastest, practical and maximum. These three estimates are then averaged to compute the final estimated story points. The authors illustrated the proposed technique using a small example project. Zahraoui and Idrissi [10] proposed adjustments to the story points measure to improve effort estimation in ASD. The new adjusted story points are calculated by adjusting the story points with the priority, size and complexity of the user stories being estimated.
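The three-point averaging described above for substories can be sketched as follows. This is only an illustration of the summary given here, not the authors' implementation: it assumes an unweighted arithmetic mean of the fastest, practical and maximum values, and it assumes that a user story's points are the sum of its substories' averaged points. The function names and sample numbers are hypothetical.

```python
from statistics import mean


def substory_points(fastest, practical, maximum):
    """Average the three story point values of one substory (unweighted mean assumed)."""
    return mean([fastest, practical, maximum])


def story_points(substories):
    """Sum the averaged points of all substories (summation is an assumption)."""
    return sum(substory_points(*triple) for triple in substories)


# Hypothetical user story split into two substories,
# each given as (fastest, practical, maximum) story points.
substories = [(1, 2, 3), (2, 3, 7)]
print(story_points(substories))  # 2 + 4 = 6
```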
Lenarduzzi et al. [11] reported a replicated case study on functional size measures and effort estimation in Scrum. The case study consisted of a small web-based application developed in ASP.NET by four part-time developers using Scrum as the development process. The results of the study showed that effort estimation by developers is more accurate than estimation through functional size measures. The study concluded that the introduction of functional size measures does not help in improving the accuracy of effort estimates in Scrum.
Tanveer et al. [12] performed an industrial case study to understand and improve effort estimation in ASD. The results showed that the accuracy of effort estimation is affected by factors such as developers' knowledge and experience, and the complexity of the required changes in the system. These factors should be considered by developers during the estimation process. They also noted that using a tool to support impact analysis can improve effort estimation, since such a tool would aid practitioners in explicitly considering important factors during the effort estimation process.
Ramirez-Noriega et al. [13] proposed a technique for the identification and validation of the factors that teams use to decide the complexity and importance of the tasks being estimated using planning poker in Scrum. The approach then uses Bayesian Networks (BNs), as a knowledge structure, to relate these factors with each other to improve the accuracy of the estimates.

Classifications of effort estimation techniques
Many different classification schemes have been proposed for effort or cost estimation techniques during the last 30 years or so. Boehm [14] presented seven classes of estimation techniques. These were: algorithmic models, analogy, expert judgment, Parkinson, price to win, top-down and bottom-up. Briand and Wieczorek [15] presented a hierarchical classification of estimation techniques. At the top are the two main categories: model-based and nonmodel-based techniques. The model-based techniques are further divided into generic and specific model-based techniques. Mendes [16], on the other hand, divided effort estimation techniques into three broad categories: algorithmic models, expert-based and artificial intelligence techniques.
A more detailed and recent classification of effort estimation techniques is proposed by Trendowicz and Jeffery (TJ) [17]. Effort estimation techniques are divided into three top categories (data-driven, expert-based and hybrid) that are further divided into subcategories. Data-driven techniques, for instance, are divided into proprietary and non-proprietary techniques. Model-based, memory-based and composite are the subcategories of the non-proprietary techniques.
The work presented in this study is different from these classifications. We are not proposing a new classification of effort estimation techniques. Our proposed taxonomy covers the whole effort estimation activity in ASD, whereby the actual effort estimation technique is only one facet of the proposed taxonomy. The facets of the taxonomy are described in Sec. 4. Since the recent work by Trendowicz and Jeffery [17] is very comprehensive and also covers other aspects besides estimation techniques, we will come back to their work in Sec. 5.

Research Methodology
In this section, we describe the research questions and methods used to design and evaluate the proposed taxonomy.

Research questions
The following research questions are addressed in this study:
• RQ1: How to organize the knowledge on effort estimation in ASD?
• RQ2: How can the agile effort estimation taxonomy be used to assess and improve reporting of research on effort estimation in ASD?
• RQ3: What is the usefulness of the taxonomy for effort estimation practice in ASD?
RQ1 is answered by organizing the knowledge on effort estimation in ASD as a taxonomy. The dimensions and facets of the taxonomy are described in Sec. 4. RQ2 is answered by applying the classification mechanism of the proposed taxonomy to characterize the effort estimation activities of the agile projects reported in the literature. The characterization is detailed in Sec. 4

Taxonomy design method
In this subsection, we briefly present the method used in this study to design the agile estimation taxonomy. The method is depicted in Table 7. The method is an update of the method presented by Bayona-Oré et al. [18], based on our observations and experiences from a systematic mapping study [19] on taxonomies in the software engineering discipline. The revised method consists of four phases and 13 activities. Although the method is presented as a sequential arrangement of phases and activities, taxonomy designers may find it necessary to iterate between the phases and activities.

Phase 1: Planning
Planning is the first phase, wherein basic decisions about the taxonomy design are made. It includes the following activities:
Define SE KA: In this activity the SE knowledge area (KA) or subarea in which the taxonomy is designed is selected and described. The Software Engineering Body of Knowledge (SWEBOK) version 3 [20] divides the SE discipline into 15 KAs (e.g. software requirements, software design, software construction, testing, etc.). These are further subdivided into subareas and sub-subareas.
The taxonomy proposed in this study is about effort estimation in the ASD context. Effort estimation plays an important role in managing agile projects during release and sprint planning [21]. Effort estimation falls within the scope of the "Software engineering management" KA in SWEBOK [20].
Describe objectives of the taxonomy: The objective(s) of the taxonomy should be clearly stated. The taxonomy designers should not leave it to the audience to guess the objectives of the taxonomy.
The main purpose of the taxonomy in this study is to propose a classification scheme that can be used to characterize the effort estimation activity of an agile project. A number of studies included in the SLR [1] on effort estimation in ASD have not reported important information related to the context, techniques and predictors used during effort estimation. The results of such studies are hard to analyze and compare, and are therefore not very useful for practitioners. The proposed taxonomy could be used by researchers to consistently report important aspects related to effort estimation in ASD.
Agile practitioners, on the other hand, could use the taxonomy during estimation sessions to remind themselves of important aspects that should be considered to arrive at better effort estimates. After an estimation session, characterizing the estimation activities using the proposed taxonomy will also facilitate the documentation of information related to the effort estimation sessions, which otherwise remains mostly tacit. The teams can use this documentation in future estimation sessions.
Describe the subject matter to be classified: The taxonomy designers should make an explicit effort to describe what exactly is classified in the proposed taxonomy. The KA specifies the broad topic of the taxonomy, while the subject matter defines what exactly is classified in the taxonomy. In a taxonomy of software testing techniques, for instance, software testing is the KA, while testing techniques are the subject matter.
The subject matter of this taxonomy is the effort estimation activities of projects that are managed with an agile method. Estimation activities are characterized by a number of factors such as the estimation technique(s) used, the effort predictors considered, the context, etc.
Select classification structure type: After these initial steps the taxonomy designers have to select an appropriate classification structure. There are four basic classification structures: hierarchy, tree, paradigm and faceted analysis [22]. These structures have their own strengths and limitations, and there are situations in which one is more suitable than the rest. The taxonomy designers should also provide a clear justification for selecting a specific classification structure.
We have selected faceted classification to structure our taxonomy. Faceted classification is suitable for evolving areas, such as effort estimation in ASD, since it is not required to have complete knowledge of the area to design a facet-based taxonomy [22]. In faceted classification-based taxonomies, the subject matter is classified from multiple perspectives (facets). Each facet is independent and has its own attributes, making facet-based taxonomies easily evolvable.
Select classification procedure type: The taxonomy designers also need to select and describe the type of classification procedure that will be employed to assign samples of the subject matter to the most relevant categories. At a broader level there are two types of procedures depending upon the employed measurement system: qualitative and quantitative. The qualitative procedures are based on nominal scales, wherein relationships between the classes cannot be established. The researchers use the nominal scale to subjectively assign samples of the subject matter to the relevant classes. The quantitative classification procedures, on the other hand, are based on numerical scales [4].
Each facet of our taxonomy has a set of possible values. The details are described in Sec. 4. Based on the available data, we used the qualitative procedure to select relevant facet values to characterize a specific estimation activity. In some cases it may not be possible to assign a value simply due to insufficient data.
Identify information sources: Lastly in the planning phase, the information sources to extract the relevant data are identi¯ed.
The information sources used in the design of our agile estimation taxonomy consist of: (A) The peer-reviewed empirical studies on effort estimation in ASD published between 2001 and 2013. The evidence from these studies was aggregated in the SLR summarized in Sec. 2. (B) Agile practitioners who participated in our survey study, which is summarized in Sec. 2.

Phase 2: Identification and extraction
In the second phase the relevant data is identified and extracted from the information sources identified in the previous phase. It consists of the following activities:
Extract all terms: The terms and concepts are extracted from the sources identified in the planning phase. Data from the literature was extracted originally as part of the SLR [1]. The data from the agile practitioners was elicited as part of our survey study [2].
Perform terminology control: The next step is to remove inconsistencies in the extracted data. It is possible that different researchers have referred to the same concept with different terms, or to different concepts with the same term. A first level of terminology control was already performed during the data extraction and analysis phases of the two empirical studies (SLR and survey) that provide the data for the taxonomy presented herein. We, therefore, only had to make a few minor adjustments to consistently represent all terms and concepts.

Phase 3: Design
After the extraction phase, the next task is to identify the main dimensions, and the categories therein, along which the extracted data items could be organized. This phase consists of the following activities:
Identify and define taxonomy dimensions: Faceted classification-based taxonomies have multiple dimensions (or perspectives) at the top along which a subject matter is classified. These dimensions can be identified in two different ways: bottom-up and top-down. In the bottom-up approach the main dimensions and the categories emerge during the extraction and analysis process [23]. In the top-down approach, on the other hand, the taxonomy designers have an a priori notion of some dimensions and categories [23] due to their knowledge of the selected knowledge area.
We have used a hybrid approach and identified the following top-level dimensions of our taxonomy: estimation technique, effort predictors, effort estimate and estimation context. The first three are part of most estimation processes described in the literature; see, for example, the estimation process described in [16]. The fourth dimension (estimation context) emerged during the data extraction and analysis phases to define and document the environment in which the effort estimation activity is carried out in an agile project. These dimensions are detailed in Sec. 4.
Identify and describe categories of each dimension: The next step is to identify and describe the next levels of the taxonomy, which are referred to in the taxonomy design method as categories. These categories can be identified by using the same top-down, bottom-up or hybrid approaches described above.
We refer to these categories as facets in our taxonomy. Most of these facets emerged when the results of the SLR [1] and survey [2] studies were analyzed. The facets related to the techniques were grouped under the estimation technique dimension. Similarly, cost drivers and size measures were grouped under the effort predictors dimension. The data items related to the effort estimate (e.g. unit of estimate) were categorized under the effort estimate dimension. The SLR and survey studies also identified a number of factors that characterize the context of the estimation activity. The identification of these factors led to the fourth dimension of the taxonomy, i.e. estimation context. The facets are described in detail in Sec. 4.
Identify and describe relationships, if any: The relationships between dimensions and categories are identified and described clearly. Note that in some cases there is no relationship between the dimensions, i.e. this activity might be skipped.
The objective of the taxonomy proposed in this study is to provide a classification mechanism to characterize the effort estimation activity in an ASD context. Each dimension is a grouping of facets that are closely related to each other. The four dimensions together can characterize an estimation activity. However, no other specific relationships between dimensions are defined, allowing for easy addition of new dimensions and facets.
Define guidelines for using and updating the taxonomy: The aim of this activity is to describe how the taxonomy can be used. As a result of the application of the taxonomy and other future research efforts, it is highly likely that the need to update the taxonomy will emerge. The taxonomy designers should also describe how the taxonomy can be updated in future.
The proposed effort estimation taxonomy can be used to assess and improve the reporting of studies on effort estimation in ASD. The taxonomy facets can be used as a checklist to ensure that all important factors about effort estimation are reported. Practitioners can document important information by using the taxonomy to characterize the specific estimation activity of an agile project. This documented information can serve as an important repository for improving future effort estimation. We applied the proposed agile taxonomy on knowledge extracted from literature and industry to demonstrate these uses. The details are provided in Secs. 4.2 and 4.3.
The dimensions and the facets of the proposed taxonomy are not exhaustive, and should be updated based on future research efforts and feedback from industry. At present they are based on the knowledge that we aggregated in our SLR [1] and survey [2]. We have used faceted classification in our taxonomy. Faceted classification is a flexible structure wherein new facets can be added relatively easily. One likely extension is under the facet "effort predictor." Currently we have only included those cost drivers that are identified as important in the SLR and the survey. Future research on effort estimation in ASD may and should investigate more cost drivers, which could then be included in our taxonomy.
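The checklist use of the facets can be illustrated with a small sketch. The four dimension names come from the taxonomy; the facet names under each dimension are illustrative placeholders drawn loosely from factors mentioned in this paper (the full facet list is given in Sec. 4), and the example characterization is hypothetical rather than taken from any reviewed study.

```python
# Dimensions are from the taxonomy; facet names are illustrative assumptions.
TAXONOMY = {
    "estimation context": ["planning level", "estimated activities",
                           "agile method", "team size",
                           "project domain", "team setting"],
    "estimation technique": ["technique", "accuracy measure"],
    "effort predictors": ["size measure", "cost drivers"],
    "effort estimate": ["estimate value", "estimate type", "estimate unit"],
}


def missing_facets(characterization):
    """Return {dimension: [facets with no reported value]} for one estimation activity."""
    gaps = {}
    for dimension, facets in TAXONOMY.items():
        reported = characterization.get(dimension, {})
        absent = [f for f in facets if reported.get(f) is None]
        if absent:
            gaps[dimension] = absent
    return gaps


# Hypothetical characterization of one estimation activity.
example_case = {
    "estimation context": {"agile method": "Scrum", "team size": 8},
    "estimation technique": {"technique": "planning poker",
                             "accuracy measure": "MMRE"},
    "effort predictors": {"size measure": "story points"},
}

for dimension, facets in missing_facets(example_case).items():
    print(f"{dimension}: not reported -> {', '.join(facets)}")
```

Because each facet is independent, extending the sketch with a new facet only requires appending a name to the relevant list, which mirrors the evolvability argument made for faceted classification.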

Phase 4: Validation
Validation is the last phase of this method wherein the taxonomy designers attempt to demonstrate the working or usefulness of the designed taxonomy. This phase has only one activity. A number of approaches can be used to validate a taxonomy. It includes establishing the orthogonality of taxonomy dimensions, comparing with existing taxonomies and demonstrating its usefulness by applying it on existing literature [24].
Since there is no existing taxonomy on e®ort estimation in ASD, benchmarking our taxonomy against existing ones is not strictly possible. However, in Sec. 5 we compare our taxonomy with a recent and extensive review of the software e®ort estimation area by Trendowicz and Je®ery [17]. Their work is selected for the comparison as it is not only recent and comprehensive, but it also covers aspects related to our taxonomy dimensions. Orthogonality of the taxonomy dimensions is discussed in Sec. 4.1. The four dimensions (Fig. 1), representing the top level of taxonomy, are clearly distinguishable from each other. E®ort estimation is performed in a certain Context, wherein agile teams use certain estimation technique(s) and e®ort predictors to arrive at an e®ort estimate.
Two methods are employed to evaluate the usefulness of the proposed taxonomy. These methods are brie°y described here.
Taxonomy evaluation À À À Literature: The proposed taxonomy is used to characterize 10 estimation cases reported in the literature. The studies were selected from the set of primary studies included in our SLR on e®ort estimation in ASD. We selected only those studies that were based on real agile projects. An extraction sheet was prepared with¯elds corresponding to the facets of the proposed taxonomy. The data reported in the selected studies was extracted and entered in the sheet with respect to the facets of the taxonomy. The results of the evaluation are presented in Sec. 4.2.
Taxonomy evaluation À À À Industry: The usefulness of the proposed taxonomy is also demonstrated by classifying and characterizing the e®ort estimation activities of¯ve agile teams. We conducted interviews of four agile practitioners about e®ort estimation activities that their teams performed in recent sprint or release planning meeting. These four agile practitioners represent a convenience sample, i.e. they were recruited based on our personal contacts. They belong to three di®erent companies, and An E®ort Estimation Taxonomy for ASD 653 provided data about e®ort estimation activities performed in the development of¯ve di®erent products. Two interviewees lead the e®ort estimation sessions, one participate as testing lead and one as developer in their respective companies. The details of the companies and products are described in Sec. 4.3. Table 8 lists their brief pro¯les.
The interviews were semistructured and were conducted by the¯rst author. The interviewees were¯rst asked to describe when and how e®ort estimation is carried out in their respective teams. The interviewees were then asked to select recent instances of estimation sessions in which they participated. The taxonomy dimensions and the facets were used to ask questions and elicit the data required to characterize the speci¯c estimation activity. The interviews lasted, on an average, for an hour. After the interviews, the data extraction sheets were sent to the respective interviewees for further feedback, validation and corrections, if required. The results of the evaluation are presented in Sec. 4.3.
Together, these four practitioners from three different companies provided data about six different estimation sessions corresponding to five different products and teams.

Results
In this section, we first present the agile estimation taxonomy to answer RQ1. The applications of the taxonomy to the literature and to the industrial estimation cases are then presented to answer RQ2 and RQ3, respectively.

RQ1: Organizing the knowledge on effort estimation in ASD
We have organized the identified knowledge on effort estimation in ASD as a taxonomy. Taxonomies are an effective tool to organize and communicate the knowledge in an area [3]. The proposed taxonomy was created by following the method described in Sec. 3. In this subsection, we present the outcome of that method, i.e. the agile estimation taxonomy.
At the top level the taxonomy has four dimensions, which are presented in Fig. 1. A dimension encompasses all facets that are related to each other. The dimension "Effort predictors," for example, includes the effort predictors employed during estimation. These four dimensions define the first level of the taxonomy, and provide an overview of the taxonomy at a broader level. Estimation context groups those facets (or categories in more general terms) that define and characterize the context in which the estimation activity is carried out in an agile project. It includes, among others, the agile method in use, team size and project domain. The effort predictors dimension encompasses the predictors used during effort estimation. The estimation technique dimension includes facets related to the technique used to arrive at effort estimates. Effort estimate includes the facets related to effort estimates, such as estimate value and estimate type.
In order to completely characterize a specific effort estimation activity of an agile project, the facets of all dimensions need to be described. In the following, we describe each of these four dimensions and their facets in detail.

Estimation context
Context is an inseparable part of software practice, and it needs to be properly captured and reported in empirical studies to communicate the applicability of the research findings. Empirical research findings are not universal due to the unique context in which the research is conducted. Empirical research needs to be properly contextualized so that readers can determine the relevance of the results for their contexts, i.e. what is applicable for a company, and when and where to use the reported findings [25].
The context dimension has eight facets. These facets and their possible values are depicted in Fig. 2. They are described in the following.
- Planning level: Estimation supports planning at different levels in ASD. These are mainly release and sprint planning, while some teams may also estimate during daily meetings [21]. Project bidding is another level wherein companies have to estimate the total development effort upfront in order to bid for projects.
- Main estimated activity: This facet describes which development activities are accounted for in the effort estimates. For example, does the total effort estimate include the time spent on fixing bugs, i.e. maintenance, or is it considered separately? It is possible that some team members have different understandings of the estimation scope because, for example, the team is newly [...]
- Project setting: This facet captures the setting in which the agile teams are developing the product. There are two broad settings: a co-located team, wherein all members work at the same location, and distributed development, wherein the teams are placed at different geographical locations. The latter introduces additional challenges related to communication and coordination. In the case of distributed teams many different settings are possible, depending on the temporal and geographical distances. Smite et al. [24] proposed a global software engineering (GSE) taxonomy that characterizes the different settings of global teams. We have used situations in their GSE taxonomy as example facet values in Fig. 2. The teams in distributed development could be working in either an onshore or an offshore setting.
- Estimation entity: Requirements can be specified in different ways and at varying levels of granularity. In ASD requirements are mostly specified in the form of user stories. In some cases user stories are estimated, while other teams divide the stories into tasks, which are then estimated. This facet documents the entity (e.g. task, user story or use case) that is being estimated.
- Number of entities estimated: It records the number of user stories or tasks estimated in a specific estimation session, and thus conveys important information related to the size of the sprint or release.
- Team size: It documents the size of the team that is responsible for developing the estimated tasks or stories.

Effort predictors
Effort predictors play an important role in arriving at effort estimates. They consist of size and other cost drivers such as team capabilities, nonfunctional requirements, etc. Product size is considered one of the main determinants of the required development effort [17]. Cost drivers also contribute to the required effort depending on the context. The effort predictors dimension has six facets. These facets and their possible values are depicted in Fig. 3.
- Product size: Development effort, in general, is strongly correlated with product size. Many research efforts have been made to empirically relate effort to product size [26]. This facet documents whether the agile team uses size as a predictor and which metric is used to represent the size. It is also possible that the estimators considered the task or story size as one of the predictors without using any specific size metric.
- Team's prior experience: A development team's prior experience with similar tasks impacts the required effort. The estimated effort for sprints that involve similar and familiar tasks will be lower than for sprints involving unfamiliar tasks. This facet describes whether a team's prior experience was considered in arriving at the effort estimates.
- Team's skill level: The team's skill level also impacts the required effort. This facet documents whether the skill level of the team members was considered during the effort estimation session.
- Nonfunctional requirements: Stringent nonfunctional requirements increase the development effort. This facet records whether or not, and which, nonfunctional requirements were considered in arriving at effort estimates. Chung et al. [27] provided a review of the types and classifications of nonfunctional requirements. We have used a few nonfunctional requirements, from a large list included in their work, as example facet values in Fig. 3.
- Distributed teams' issues: The number of sites that collaboratively develop the software also has an impact on the development effort due to the increased complexity of collaboration and communication. The cultural and geographical barriers between distributed development teams can increase the development effort.
The actual facet values for some cost drivers, such as team's skill level, are not provided in Fig. 3. At the first level it is important to characterize whether or not certain cost drivers are considered relevant during effort estimation.
It is not simple to measure these cost drivers quantitatively. A commonly used approach in most estimation methods, such as the COnstructive COst MOdel (COCOMO) [28], is to rank these cost drivers on an ordinal scale such as low, medium or high. A similar approach can be used here if specific quantitative constructs or measures are absent.
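
The ordinal ranking can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Rating` scale, the `drivers_rated_at_least` helper and the example ratings are hypothetical and do not come from any of the studied teams; only the driver names are taken from the facets described above.

```python
from enum import IntEnum


class Rating(IntEnum):
    """Ordinal scale for a cost driver (hypothetical three-level scale)."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def drivers_rated_at_least(ratings, threshold):
    """Return the names of cost drivers ranked at or above `threshold`, sorted by name."""
    return sorted(name for name, rating in ratings.items() if rating >= threshold)


# Invented ratings for one estimation session. None records that a driver
# was not considered at all, which is what the taxonomy captures first.
session_ratings = {
    "team's prior experience": Rating.LOW,
    "team's skill level": Rating.HIGH,
    "nonfunctional requirements": Rating.MEDIUM,
    "distributed teams' issues": None,
}

# Keep only the drivers that were actually considered and ranked.
considered = {name: r for name, r in session_ratings.items() if r is not None}
print(drivers_rated_at_least(considered, Rating.MEDIUM))
```

Keeping "not considered" (`None`) separate from a low ranking preserves the distinction between a driver that was judged to have little influence and one that was never discussed.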

Estimation technique
This dimension encompasses facets related to estimation techniques that should be documented in order to characterize the estimation activities of an agile team. The facets of this dimension and their corresponding example values are depicted in Fig. 4. Agile teams can also use a combination of techniques to arrive at an effort estimate.
- Type of technique: There are different types of estimation techniques and models, such as algorithmic, expert-based and artificial intelligence-based techniques and models. In ASD, estimation techniques that rely on the subjective assessment of experts are frequently used [1]. These techniques can be used by an individual or a group of experts. This facet documents whether the effort was estimated using an individual or a group-based estimation technique.

Effort estimate
Effort estimates are the main output of the estimation activity. The facets in this dimension characterize the effort estimate. These facets and their corresponding values are depicted in Fig. 5.
- Estimated effort: This facet documents the main output of the estimation session or activity, i.e. the estimated effort.
- Actual effort: It is also important to have the actual effort, at the end of the release or sprint, to enable comparison with the estimated effort.
- Estimate type: The estimate could be of different types, e.g. a point estimate, a three-point estimate or a distribution. This facet documents the type of estimate used by the agile team.
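
The four dimensions and their facets can be sketched as a simple data structure. The following Python sketch is an illustration, not part of the taxonomy design method: the `TAXONOMY` mapping and the `characterize` helper are hypothetical names, and the mapping lists only facets that are named in the text of this section (the figures contain further facets and example values).

```python
# Dimension -> facets named in the text (incomplete with respect to Figs. 2-5).
TAXONOMY = {
    "estimation context": [
        "planning level",
        "main estimated activity",
        "project setting",
        "number of entities estimated",
        "team size",
    ],
    "effort predictors": [
        "product size",
        "team's prior experience",
        "team's skill level",
        "nonfunctional requirements",
        "distributed teams' issues",
    ],
    "estimation technique": ["type of technique"],
    "effort estimate": ["estimated effort", "actual effort", "estimate type"],
}


def characterize(reported):
    """Build a full characterization of one estimation activity.

    `reported` maps dimension -> {facet: value}. Facets without a reported
    value are kept with None so that gaps stay visible.
    """
    return {
        dimension: {facet: reported.get(dimension, {}).get(facet) for facet in facets}
        for dimension, facets in TAXONOMY.items()
    }


# Invented values for a single sprint planning session.
session = characterize({
    "estimation context": {"planning level": "sprint", "team size": 6},
    "effort estimate": {"estimate type": "point estimate"},
})
print(session["estimation context"])
```

Because every facet appears in the result even when no value was given, the structure reflects the requirement that the facets of all dimensions be described.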

RQ2: Reporting of research on effort estimation in ASD
In order to evaluate the usefulness of the proposed taxonomy in assessing the completeness of reporting, we used the taxonomy to characterize 10 effort estimation cases reported in the literature. The estimation cases were elicited from eight empirical studies on effort estimation in ASD. The characterizations are presented in Table 9. The agile estimation taxonomy is able to describe the effort estimation cases reported in the literature. It is, however, interesting to note that some studies do not report important information, which makes it hard to compare and analyze the results of these studies. We observed the following specific issues during this exercise:
- Authors make an effort to describe the context in which estimation studies are conducted. However, in many cases the context is described very briefly and is scattered across the report. We were not able to extract the project domain, planning level and number of estimated stories/tasks in two, three and four cases, respectively.
- The reporting of the studies lacks details with respect to the effort predictors used during effort estimation. In six cases we were not even able to establish whether or not, and which, cost drivers were used. It is hard to imagine a real scenario wherein effort is estimated without considering any effort predictors. It is possible that the focus of a study is not on effort predictors. However, this should then be stated explicitly.
- Two studies have not described the technique used to estimate effort. The remaining eight studies have described the technique and the corresponding accuracy measure used. However, in five cases we were not able to determine whether the technique was used by an individual or a group of experts.
- Actual and estimated efforts are not reported in six cases, which makes it hard to analyze and compare the findings.
The proposed taxonomy can be used by researchers to make sure that important information related to agile effort estimation is duly reported. This, we hope, will improve the reporting and facilitate comparing and analyzing the results of studies on effort estimation in ASD.
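
A completeness check of this kind can be sketched in Python. The sketch assumes that each estimation case is stored as one row of the extraction sheet, keyed by facet, with `None` (or an absent key) marking a facet that a study does not report. The function name `count_unreported` and the three example rows are invented for illustration; they are not the cases characterized in Table 9.

```python
def count_unreported(cases, facets):
    """For each facet, count the cases in which it is not reported (missing or None)."""
    return {
        facet: sum(1 for case in cases if case.get(facet) is None)
        for facet in facets
    }


FACETS = ["planning level", "project domain", "estimated effort", "actual effort"]

# Invented extraction-sheet rows; values are placeholders, not study data.
example_cases = [
    {"planning level": "release", "project domain": "telecom",
     "estimated effort": 120, "actual effort": 150},
    {"planning level": None, "project domain": "web",
     "estimated effort": 80},
    {"planning level": "sprint", "project domain": None,
     "estimated effort": None, "actual effort": None},
]

for facet, missing in count_unreported(example_cases, FACETS).items():
    print(f"{facet}: not reported in {missing} of {len(example_cases)} cases")
```

Treating an absent key and an explicit `None` alike keeps the check robust when a sheet row simply omits a field.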

RQ3: Usefulness of taxonomy for effort estimation practice in ASD
The aim of this RQ was to evaluate whether the proposed taxonomy can be used by agile teams to document their estimation sessions for future use. We used the taxonomy to classify and characterize the estimation activities of five agile teams working in three different companies. First we provide a brief description of the companies, teams and related products. The companies are referred to here as A, B and C due to privacy concerns.

Company A
Company A is a medium-sized software company established in the mid-90s and headquartered in Finland, with offices in Sweden and India as well. A develops and offers Business Support Solutions (BSS) to telecommunication service providers. It offers BSS services related to customer, product and revenue management. We collected the effort estimation-related data from the recent sprint planning meetings of two different teams at company A. We refer to these two teams as A1 and A2. A sprint normally lasts for two weeks. Both teams are located in Sweden, and use Scrum as their development process. A1 is responsible for developing the Order Management System (OMS) of BSS, while A2 is developing the Customer Information System (CIS). We interviewed a developer from team A1 and a testing lead from A2, who both participated in the sprint planning meetings of their respective teams. Each interview lasted for about an hour. The effort estimation data were entered in extraction sheets by the interviewer, which were later sent to the interviewees for corrections and for adding missing information, if any. Our agile estimation taxonomy was used (see Table 10) to classify the specific effort estimation activities of both teams.

Company B
Company B is a very large organization, headquartered in Sweden, providing services and support systems related to communication technology. B develops and provides software solutions for Operations and Business Support Solutions (OSS/BSS). We interviewed a product owner for 2 h who leads the release plan meetings that involve both design and testing leads and other team members. The teams are based in Sweden, and are following Scrum as their development process. They are responsible for developing one subsystem of a large billing system.
We elicited effort estimation data corresponding to two different releases; one is based on customer requirements, while the other is a research-based internal release. We used the data elicited in the interview to characterize the effort estimation activities in Table 11.

Company C
Company C is a small to medium-sized Pakistan-based software company established in the late 90s. It has around 50 employees based in Islamabad, Pakistan, and develops mainly performance enhancements and database applications. C follows a customized version of Scrum in all projects. We collected effort estimation data, related to our taxonomy, from the release planning meetings of two teams working on two different products. We refer to the two teams as C1 and C2. A release normally consists of four iterations, and takes about 4-6 months to complete. An iteration is usually 4-6 weeks long. Data were collected in a 1.5-h interview with the development manager of the company, who is responsible for overseeing all development teams of the company. He has the additional responsibility of managing the systems' architecture as well. He leads both the release and sprint plan meetings of both products, which are also attended by the relevant development and testing leads. The first author conducted the interview and recorded the responses in an Excel file. The Excel file was later sent to the interviewee for entering missing information and validating the recorded information.
Team C1 consists of six members, and is responsible for developing a performance enhancement application (referred to as P1) for .Net and Java development environments, while team C2 consists of 10 members and is developing a NoSQL-based database product (P2). We collected the effort estimation-related data from the sprint plan meetings for the first three sprints of both products. The effort estimation activities of both teams C1 and C2 are characterized using our agile estimation taxonomy in Table 12. We only used the data of the first sprints for both products.

Summary
The interviewed practitioners found the taxonomy useful in characterizing their estimation activities. They noted that it offers a simple mechanism to document this information, which can be used subsequently. We asked them to suggest any changes or updates to the taxonomy when we sent the extraction sheet for validating the extracted data. The following three extensions to the agile estimation taxonomy were suggested.
- When the estimation is done as part of release planning, the taxonomy should also document the number and duration of the iterations/sprints in a release.
- While capturing the context, the taxonomy should also characterize the development mode, i.e. whether the agile team is engaged in product- or project-based development.
These suggested extensions are added to the context dimension of the agile estimation taxonomy. Figure 6 presents the extensions as new facets in the context dimension. It is relatively easy to add or remove a facet in any dimension. This is one of the benefits of using the faceted classification structure in a taxonomy.
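
The ease of extension follows from the faceted structure: each dimension is a flat list of facets, so adding or removing one facet does not affect the others. A minimal Python sketch, assuming a dimension is stored as a list of facet names (the helper names `add_facet` and `remove_facet` are hypothetical, and the initial facet list is shortened for brevity):

```python
def add_facet(dimension, facet):
    """Return a new facet list that contains `facet` exactly once."""
    return list(dimension) if facet in dimension else [*dimension, facet]


def remove_facet(dimension, facet):
    """Return a new facet list without `facet`."""
    return [f for f in dimension if f != facet]


# Shortened context dimension before the practitioners' feedback.
context = ["planning level", "team size"]

# Two of the suggested extensions, added as new context facets.
for suggested in ["iteration/sprint details", "development mode"]:
    context = add_facet(context, suggested)

print(context)
```

Both helpers return new lists, so an earlier version of a dimension can be kept for comparison when the taxonomy evolves.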
We have used the categories of software types proposed by Forward and Lethbridge [37] as example facet values for the application-type facet in Fig. 6. These are high-level categories. More specific types of software applications are also listed in their taxonomy.

Discussion and Validity Threats
The agile estimation taxonomy is based on the knowledge identified in our SLR [1] and survey [2] studies. Companies and researchers can introduce new dimensions and facets into the taxonomy. Since we used a faceted classification structure in designing this taxonomy, it is flexible enough to accommodate changes in the future.
In this section, we compare our taxonomy with an existing work and also describe limitations of our work.

Comparing agile estimation taxonomy with existing work
Trendowicz and Jeffery [17] provide a comprehensive review of software effort estimation covering aspects related to effort estimation foundations, techniques, drivers, context and guidelines for selecting an appropriate technique in a specific context. We compare the dimensions and facets of our taxonomy with the relevant aspects of the work by Trendowicz and Jeffery [17] (see footnote c). It is important to note the following points before moving further.
- Trendowicz and Jeffery do not present their work as a taxonomy or classification. However, the book covers many important aspects related to software effort estimation, and therefore provides an opportunity for comparison with our taxonomy.
- The TJ estimation reference is about software effort estimation in general, and is not meant specifically for ASD.
- The evidence identified and analyzed in our SLR [1] served as one of the inputs to the agile estimation taxonomy presented in this paper. The SLR was published in 2014, and included only peer-reviewed works published before December 2013. The TJ estimation reference was published as a book in 2014. Our taxonomy and the TJ estimation reference can therefore be considered independent works.

Effort estimation context
The facets corresponding to the estimation context dimension of our agile estimation taxonomy are compared with the contextual factors in the TJ estimation reference in Table 13. There are six contextual facets that are not covered in the TJ estimation reference: planning level, main estimated activity, project/product domain, type and number of estimation entities, development mode and iteration/sprint details. The TJ estimation reference is general and therefore some factors that are more specific to ASD are not covered. Planning levels (i.e. release planning, iteration planning), estimation entities such as user stories and tasks, and the concept of sprints/iterations are more specific to the ASD context. There are two contextual factors listed in the TJ estimation reference that are not part of our taxonomy: the programming language and the development type (new development or enhancement). These could be added as new facets in the estimation context dimension of our taxonomy, as both provide additional information that might be helpful in characterizing the estimation context.
Footnote c: We refer to their work as the "TJ estimation reference" in this comparison.

Effort estimation techniques
The TJ estimation reference includes a very detailed classification of effort estimation techniques. The classification is presented as a hierarchy which includes specific techniques as leaf nodes and their types at higher levels of the hierarchy. The facets of the estimation technique dimension are compared with this work in Table 14. All facets of the estimation technique dimension of our taxonomy are also covered in the TJ estimation reference.

Effort predictors
All facets of the effort predictors dimension of our agile estimation taxonomy are also present in the TJ estimation reference (see Table 15).
There are many effort predictors, described in the TJ estimation reference as common effort drivers, that are not part of our taxonomy: software, database and architecture complexity, requirements novelty, requirements stability, project constraints, schedule pressure, tool usage, CASE tools and testing tools. These effort predictors are currently not part of our taxonomy as we could not find any evidence wherein they have been investigated as drivers of project effort in ASD. There is a need to investigate whether these generic effort drivers are relevant in an ASD context.

Effort estimate
Apart from estimation unit and accuracy level, the other facets of this dimension of our taxonomy are described in the TJ estimation reference (see Table 16). These are, however, described in different places in the book. Some studies only report the accuracy level achieved in terms of percentages (e.g. MMRE: 30%), and do not specify the actual and estimated efforts. As for the estimation unit, in ASD some teams use the concept of ideal time as the unit of effort instead of the standard person hours/month unit [21].
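
The dependence of such accuracy levels on both effort values can be made concrete. MMRE (mean magnitude of relative error) is commonly defined as the mean of |actual - estimated| / actual over all estimated items. The Python sketch below uses that standard definition; the function name `mmre` is hypothetical and the effort values are invented for illustration.

```python
def mmre(actual, estimated):
    """Mean magnitude of relative error over paired actual and estimated efforts."""
    if not actual or len(actual) != len(estimated):
        raise ValueError("actual and estimated must be non-empty and of equal length")
    # Relative error of each item is measured against its actual effort.
    return sum(abs(a - e) / a for a, e in zip(actual, estimated)) / len(actual)


# Invented efforts (e.g. in person hours) for two user stories.
actual_effort = [10, 20]
estimated_effort = [8, 25]

print(f"MMRE: {mmre(actual_effort, estimated_effort):.1%}")
```

Since MMRE is an aggregate ratio, the individual actual and estimated efforts cannot be recovered from a reported percentage alone, which is one reason to record them as separate facets.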

Summary
The main findings of this comparison are described in the following:
- There are aspects that are specific to ASD, such as some facets in the context dimension. These are not part of the TJ estimation reference, which is about effort estimation in general. Agile practitioners should therefore consult estimation works that are specific to ASD, such as our taxonomy, besides considering general works on effort estimation.
- There are many aspects, such as some cost drivers, included in the TJ estimation reference that are not covered by our agile estimation taxonomy. There is a need to investigate the relevance and importance of these effort drivers in an ASD context in future research efforts.

Validity threats
We evaluated the proposed taxonomy by using it to characterize estimation cases from the literature and the industry. We also compared our taxonomy with a reference work by Trendowicz and Jeffery [17]. However, this study may still have some limitations. We discuss the validity threats using the categorization proposed by Petersen and Gencel [38] corresponding to two main phases of research: data collection and analysis.
- Descriptive validity is concerned with issues that may arise due to poor gathering and recording of data, which would result in an inaccurate description of the "truth". The data identified and aggregated in the SLR [1] and the survey [2] were used as the main input to the taxonomy designed in this study. In the SLR the studies were identified and analyzed following the Kitchenham and Charters guidelines [7]. The survey instrument (questionnaire) was designed and reviewed iteratively to ensure readability and correctness. The survey was based on a sample of 60 agile practitioners. The validity threats and the corresponding mitigating actions for the SLR and the survey are described in [1] and [2], respectively.
- Interpretive validity is concerned with ensuring that reasonable conclusions are drawn based on the collected data, and that issues such as researchers' bias do not lead to incorrect conclusions. We applied a taxonomy design method to organize the knowledge on effort estimation in ASD as a taxonomy in a systematic manner. The taxonomy was evaluated by using it to characterize estimation cases from the literature and the industry. The taxonomy was also compared with a recent detailed review of software effort estimation [17].
- The taxonomy is applied to characterize 10 agile effort estimation cases from the literature and six estimation sessions of five different agile teams. The results are generalizable only to those projects and teams that have similar contexts.
The authors of this study have experience in ASD and/or effort estimation. We followed a method to design the taxonomy in a systematic way. However, we still believe that the involvement of more experts could have further improved the completeness and utility of the proposed taxonomy.

Conclusions and Future Work
The design of taxonomies helps in structuring and communicating existing knowledge, and in advancing research [3]. We organized the existing body of knowledge on effort estimation in agile software development as a taxonomy. The taxonomy was developed in a systematic way by following a taxonomy design method. The usefulness of the taxonomy was demonstrated by applying it to data extracted from the literature and the industry. The taxonomy was applied to characterize 10 selected estimation cases reported in the literature. It was observed that some studies do not report important information related to effort estimates, the effort predictors used and the context in which the estimation activity was performed. The taxonomy can be used by researchers to report their findings in a consistent way, allowing the audience to compare, analyze and aggregate them. The taxonomy was also applied to characterize the estimation activities of agile teams of three companies.
The involved practitioners noted that the taxonomy provides a useful mechanism to document their estimation sessions.
The taxonomy has not yet been used during effort estimation sessions. We plan to develop a checklist instrument based on our agile estimation taxonomy with the aim of supporting agile practitioners during estimation sessions. The idea is to employ the checklist during sprint or release plan meetings while estimation is in progress. Practitioners could use the checklist to remind themselves of important factors that should be considered while estimating user stories or tasks. It will also help in documenting the knowledge used by agile practitioners to arrive at effort estimates, which would otherwise remain tacit in most cases. Our taxonomy could also be used as a tool in the development of a repository of effort estimation knowledge, which could be used in the long term to better understand and improve the practice of effort estimation in ASD.