-
Notifications
You must be signed in to change notification settings - Fork 2
/
Copy pathmaking-action.tex
360 lines (303 loc) · 32.2 KB
/
making-action.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
% !TEX root = thesis.tex
%\startchapter{Proof of Concept: Can we generate useful recommendations?}
\startchapter{Appropriateness of Technical Relations}
\label{chap:actionable}
In this chapter, we present the final piece of evidence in the line of exploring the usefulness of our approach presented in Chapter~\ref{chap:approach}.
With a case study, we demonstrate that we can generate recommendations to prevent build failures when it actually matters.
Therefore, this chapter explores the following research question:
\begin{description}
\item[RQ 2.3:] Can recommendation actually prevent build failures?
\end{description}
In essence, generating recommendations after the fact, even if the form and content in generally accepted by developers, is of little use.
This is because it is too late to act upon it.
In this chapter, we explore not only whether the recommendation, if given at the right moment, could have prevented a build from failing, but also if we could have generated that very recommendation at an appropriate time for preventative actions to take place.
In the remainder of this chapter, %we start with detailing the background as to how we can generate recommendations in ``real time'' (Section~\ref{ch10:bg}).
%Subsequently,
we describe the course setting in Section~\ref{ch10:setting}.
Following this, we describe our methodology in Section~\ref{ch10:meth}.
Then, we present the analysis and results we obtained in Section~\ref{ch10:ana} and Section~\ref{ch10:findings}, respectively, followed by a discussion of the results in Sections~\ref{ch10:dis}.
We conclude this chapter by offering an answer to our research question (cf. Section~\ref{ch10:con}).
\section{A Course on Globally Distributed Software Development}
\label{ch10:setting}
We conduct a case study of the collaboration of students in a project taken in a class in globally distributed software development taught at the University of Victoria, Canada and Aalto University, Finland, as the basis for exploring our final research question for several reasons:
\begin{itemize}
\item Control over a project infrastructure that supports data collection.
\item The distribution of the course across countries ensures a minimum of electronically recordable and mineable communication data being generated.
\item A course setting allows for more control to ensure data quality.
\item A course setting enables better access to study participants.
\end{itemize}
One key to exploring our research question is to collect data to analyze.
In a course project we are able to choose and instrument the infrastructure used by the students to actively develop software, and thus possibly gain more insight, or at least miss less data, than in most industrial settings that rarely focus on collecting data for analysis.
This especially allows us to establish traceability links between artifacts that are often missing and need to be inferred using semi-accurate heuristics.
Furthermore, the distributed nature of the project makes synchronous non-text based communication more difficult, and thus increasing the use of text based asynchronous, and therefore recorded, communication.
This additional need to rely on flexible and stable communication media, such as email and text based chats, enables us to collect more data that is easier to mine than it would have been realistic in a collocated setting.
The distributed nature of the teams, also induced an openness to work in a more distributed fashion even among collocated team members due to a sensibilization of trying to enable their remote team members to follow their discussions.
Due to the nature of the study being conducted in a course setting, we had more control about how participants are using technology, and could give guidance on how to improve their collaboration through means that are easy to record.
Because participants need to attend lectures, we can educate them in a way that emphasizes the importance of recording traceability links.
Since the participants are students taking a class on software development, we can more easily, ensure the quality of data that we are collecting.
Through a feedback loop we can change the data collection to accommodate the needs of the students more easily and therefore ensure data quality by mitigating the risk of participants neglecting the data collection process.
\begin{table}[t]
\centering
\caption{Descriptive statistics about the student development effort}
\begin{tabular}{rrrrr}
\toprule
& Team A & Team B & Team C & Total\\
\midrule
\#failed builds &15&13&19&47\\
\#successful builds &41&14&1&56\\
\#builds & 56 & 27 & 20 & 103 \\
\#Canadians & 4 & 4 & 4 & 12\\
\#Fins & 2 & 2 & 2 & 6\\
%\#User Stories & & & & \\
\#Change Sets & 440 & 491 & 452& 1383\\
\bottomrule
\end{tabular}
\label{tab:gsd:desc:stats}
\vspace{-5pt}
\end{table}
%Team C = 35+62+50+44+47+19+21+51+22+43+42+7+2+3+1+3
%Team B = 20+52+64+50+44+47+19+21+51+22+43+42+7+2+3+1+3
%Team A = 4 +17+ 64+ 50+ 44+ 47+ 19+ 21+ 51+ 22+ 43+ 42+ 7 +2+ 3+ 1+ 3
\subsection{Course Details}
The globally distributed software development course taught at both the University of Victoria, Canada and Aalto University, Finland, aims at teaching students how to develop software in a distributed setting.
In this particular course student teams where forced to deal with the issues that come with a distributed team in an extreme way since both West Coast Canada and Finland are separated by a time difference of eleven hours minimizing the time all team members are awake.
This both necessitates and motivates the students to apply lessons learned in the classroom.
The course on globally distributed software development took place starting in October 2011 and ended in April 2012.
With the Finnish students starting in October 2011 and the Canadian students joining the development effort in January 2012.
This further allowed the students to engage in remote collaboration as the Finish students represent the experts on the system the teams are working on and will need to guide their Canadian teammates in getting used to the projects layout and functions.
Table~\ref{tab:gsd:desc:stats} shows some descriptive statistics about the development effort the students undertook.
The 18 student developers together committed a total of 1383 change-sets while creating 103 builds.
We can see that some teams subscribed to the frequent build mentality more than others. For example, Team B and C created builds substantially less often than Team A over the same time window.
Besides developing a software product, the Canadian students also had regular classes.
During a regular class the students participate in a seminar style setting that requires each student to read a set of papers.
These papers are discussed on the course blog with respect to implications of the findings detailed in the papers and how these findings may apply to their current development experience.
Every two weeks the seminar style class is interrupted for a project wide meeting to bring together all the teams including the Finnish team members, to show case their progress and to plan the next steps for the following two weeks.
The students filled out a survey every other week to give feedback on the class, the development process, and the use of different tools for communication and development.
Besides this bi-weekly surveys, the students also had the chance to talk to the course instructors, as well as the teaching assistant, to issue concerns and make suggestions on how to improve the course for the students, as well as bring up issues concerned with the quality of the collected or non-collected data.
\subsection{Team Composition}
As described earlier, each development team consists of both students from Canada and Finland.
There are three teams, each of which consist of of four students from Canada and two students from Finland.
In addition to the three teams, we also have a product owner that came up with the idea of the product that we describe in greater detail in Subsection~\ref{chap:make:subset:devproject}.
The product owner is located in Finland, and is responsible for prioritizing the different features that he wants to be implemented.
Moreover, he negotiates with the teams which items they should aim for within a given two week span.
To coordinate the work across teams, one Finish student has become the SCRUM master or de-facto project manager.
He is both the most experienced developer, and has the most knowledge about the projects, as he is using the product within his own company.
Some of his responsibilities entail facilitating coordination across teams, setting up meetings between teams, and structuring the biweekly project meetings.
He also takes care of scoping out development tools to use and ensures that everyone is up to speed with issues arising from other teams that may implicate their work.
\subsection{Development Project}
\label{chap:make:subset:devproject}
The students in this course are extending the agile-planning tool Agilefant.\footnote{www,agilefant.org}
Agilefant combines task creation, task management, and planning into one tool that allows scheduling different tasks for different sprints, with a sprint being a time period that is meant to implement a fixed set of tasks.
Furthermore, tasks can be organized hierarchically and be of different types, such as user stories or bug reports.
As mentioned previously, this product was conceived by the product owner and has since been developed by students participating in cap-stone projects at Aalto University, Finland.
The tool itself is open source and available for anyone to download, install, and extend as they see fit.
Besides being a student driven project it found a number of users in industry.
Due to the development being mainly undertaken in the cap-stone projects that are offered yearly to the students, the project progresses in yearly bursts.
Nevertheless, the stability of its core feature and lightweight approach made it popular among small and mid-sized companies for planning their development projects.
\subsection{Development Process}
The project follows a SCRUM process\footnote{http://en.wikipedia.org/wiki/SCRUM\_(development)}, and thus development progresses in small sprints working of a common backlog.
A sprint, in the case of our project, is defined to be two-weeks long starting with a planning session and ending in a retrospective.
The planning sessions and retrospectives take place between two sprints at the same day starting with the retrospective followed by the planning for the next sprint.
\vspace{-5pt}
\subsection{IT-Infrastructure}
\vspace{-5pt}
The source code is managed using GIT\footnote{http://git-scm.com/} and is available on github.com\footnote{https://github.com}.
Although Agilefant or github.com issues can be used for task management, both are missing important features.
For example, github.com is missing the planning capabilities of a full agile planning tool and Agilefant is missing the ability to discuss work items.
Therefore, we use IBM Rational Team Concert for the planning of sprints and tracking of work items.
An additional advantage of IBM Rational Team Concert is that it integrates with Eclipse the main development environment, which also integrates with GIT.
Besides using discussions on work items, the students also use email and IRC for communication.
To ensure that everyone has the same knowledge students are encouraged to summarize their emails and IRC discussions in work item discussions.
For group meetings, especially the planning and retrospective sessions, we provided the student with separate rooms and internet access to use Google hangout for video chatting and screen sharing when discussing their group individual sprint plans.
For the project meeting, after each sprint to demo their accomplishment to the development team at large, we provided a conference room with multiple microphones and screens to simultaneously show a number of remote participants as well as screen sharing sessions.
To further support the development process, the students set up a Jenkins instance, which is an open source web-based build system.
This build system allows for continuous testing which helps the students to get a better understanding of their progress with respect to implementing their tasks for a given sprint.
\section{Methodology}
\label{ch10:meth}
\subsection{Proximity to Infer Real Time Technical Networks}
\label{chap:making:subset:proximity}
In order to be able to explore the possibility of recommending improvements with respect to the socio-technical networks, we need to be able to intervene before the majority of work has been done.
The main issue with creating real time socio-technical networks lies in constructing technical networks, as they are only known once a developer submits their changes to a repository.
On the other hand, social-networks can be more readily constructed in real time because in order to communicate, at least electronically, yields a trace for each bit of communication traveling from one person to another allowing the extension of a social network with each new instance of communication.
To gain a similarly effective way of constructing technical networks we build upon the work of Blincoe et al.~\cite{blincoe:cscw:2012}, who proposed a measure called proximity using Mylyn~\cite{kersten:aosd:2005} traces to infer real time developer dependencies.
Mylyn is an Eclipse plug-in that records edits and selects within source code files.
These events show that artifacts were modified and selected, thus allowing the partial construction of a technical network based on ownership information of the modified, selected, and implicated source code artifacts.
In order to record this information in real time in a central repository we developed a plug-in that captures events recorded by Mylyn, connecting them to the current task a developer is working on, and submitting this information to a central server.
To encourage students further to install this plug-in, we also provided a visualization of the current proximity of a developer to any other developer given her current task in a visualization plug-in called ProxiScentia~\cite{borici:chase:2012}.
\subsection{Data Collection}
For our study we collect six types of data:
\begin{itemize}
\item Communication data such as e-mails, IRC chat logs, and work item discussions.
\item The source code repositories on github.com, including each developer's fork of the main repository.
\item Log of the build engine indicating failed and successful builds.
\item Documentation and diaries created by students during the course.
\item Surveys and questionnaires completed by students throughout the course.
\item Mylyn events gathered using our custom built Eclipse plug-in.
\end{itemize}
Although we are not necessarily interested in constructing social networks for this part of the study, but rather we are concerned whether it is possible to prevent build failures given a certain set of collectable data, we need access to as much communication data as possible not only as a means to see what kind of problems arose but also as to whether they were of any importance to the developers.
%
For example, are build failures a problem for the developers and if so are all build failures equal.
Knowing which build failures are actually problematic to the developers allows us to focus on those instances that matter.
To further the investigation into the issues reported by the developers, we make use of the version control systems.
In our case, these are several GIT repositories hosted on github.com.
Each developer owns their own forked repository from which they pushe changes to their respective team repository.
Once the product owner has reviewed a feature implementation or bug fix, it is pushed to the project repository to ensure that each development team is working on the latest stable code.
Since implementing quality features was a main concern during the class, the students set up a build-engine to continuously test their implementations.
Jenkins,\footnote{\url{http://jenkins-ci.org/}} the build engine used by the teams, keeps a log of all the builds as well as their outcomes.
Besides collecting data that is generated by the development activities of the students during the course, we also asked them to keep a development diary as well as fill out frequent surveys that gauge their impression of the progress within the project as well as the adequacy of the tools used.
Finally, as mentioned in the previous Subsection~\ref{chap:making:subset:proximity}, we also collect the code interactions with respect to which methods in the source code have been edited or selected in a central repository.
\subsection{Analysis}
\label{ch10:ana}
We analyze data collected during the course in six steps, starting with identifying issues of interest to show how recommendations based on our approach could have avoided these build failures:
\begin{enumerate}
\item Identify build issues mentioned by the developers.
\item Identify the actual failed build in the Jenkins build logs.
\item Identify the code changes that contributed to the build failure.
\item Check if the recorded code interactions could have predicted the build failure.
\item Check if the relation between the code changes could have predicted the build failure.
\item Check if using the social-network helps to find the right people to include in order to resolve the issue.
\end{enumerate}
\paragraph{Identifying build issues.}
To find build issues, that are truly build issues to the developer, and not only reported as failed due to failing test cases that were meant to fail, we code the communication recorded in the IRC channels, work item and email discussions, and the developer diaries with respect to problems experienced in the project.
From those coded problems, we select all problems related to building the software.
The students mentioned three builds in greater detail in their development diaries and chat sessions.
\paragraph{Identifying the actual failed build.}
After we use the recorded communication to identify which failed builds are true failures, we identify those builds as they are shown in the Jenkins logs.
The Jenkins log showed that the students submitted a total of 103 build requests, of which 47 produced builds with failures.
These builds show what went wrong, as well as what changes are new to those builds, that potentially broke it.
\paragraph{Identify code changes.}
For each actual failed build, we infer which change-sets were newly added to the code base between the previous build and the failed build.
From a change-set we can determine the task worked on, as we require every student to annotate each submitted change-set with the work item identifier the change-set was meant to implement.
\paragraph{Check code interactions.}
After we identify the change-sets that were newly added to a build we can identify the developers that worked on the same source code files.
\paragraph{Check change relations.}
Knowing both the change-set, as well as the interaction traces that where involved in the failed builds, we can contrast and overlay those change-sets, as well as code interactions, to see if the path of different developers overlap in a way that allows us to predict the issues that were reported on by the students.
\paragraph{Check social network.}
In case we observe that the code traces that we collected could have predicted the issue, we look for cues in the communication media for the resolution of the issues to see whether the developers knew they shared a technical dependency expressed in those code traces.
Finding a relation between the code interactions and the ongoing communication during the implementation contrasted with the communication needed to resolve the issue created by the change-set is the key to building a useful recommendation.
\ \\
Instead of analyzing each build failure that is important to the developers, we focus on finding one build that let us trace the error back to the code interactions.
Moreover, it shows a connection between the code interactions and the communication needed to resolve the build issue.
\section{Findings}
\label{ch10:findings}
In this section we present the findings from our analysis.
\subsection{Build Failures That Matter}
Throughout the feedback and communication logs received from the students we observe that the most important problems to the students are those build failures that prevent other teams from continuing their work.
These issues usually happened towards the end of the sprint, as this is the time in which teams are integrating their code into a common code base.
\\ \ \\
\indent\emph{``Unexpected problems when merging the old code in, had to overload a couple methods [...] and create some new code to mesh it together. Eventually fixed with a decent solution.''} - Student comment
\\ \ \\
\indent On the other hand, most failed builds are of less interest to the students for two reasons:
(1) tests became outdated as the core functionality of the product changed considerably
and (2) the product owner was deciding what features are of acceptable quality regardless of the test case results.
Since most of the work preformed by the students changed, or extended, core functionality, old test cases started to fail.
Because of this the product owner is deciding on whether a feature is deployable based on a demo, and manual testing lead test cases to loase value with respect to deciding whether a feature is of acceptable quality.
\begin{table}[t!]
\centering\small
\caption{Overlapping code interaction traces that lead to a merge conflict}
\begin{tabular}{@{\hspace{2pt}}l@{\hspace{2pt}}}
\toprule
Angelique's Method Trace for Task ... \\%
\midrule
HourEntryBusiness.java-calculateWeek-Sum-QLocalDate \\% 2012-02-12 18:01:59.344\\
HourEntryBusiness.java-getDailySpentEffortByInterval-QDateTime-QDateTime \\% 2012-02-12 18:51:43.353 \\
HourEntryBusiness.java-getDailySpentEffortByIteration-QIteration \\% 2012-02-12 17:59:43.698 \\
HourEntryBusiness.java-getDailySpentEffortByWeek-QLocalDate \\% 2012-02-12 18:50:46.942 \\
%HourEntryBusiness.java-getDailySpentEffortByWeek-QLocalDate \\% 2012-02-12 17:45:23.787 \\
HourEntryBusiness.java-getEntriesByUserAndDay-QLocalDate \\% 2012-02-12 18:02:53.219 \\
HourEntryBusiness.java-logBacklogEffort-QHourEntry-QSet \\% 2012-02-12 17:59:33.333 \\
HourEntryBusinessImpl.java-calculateWeekSum-QLocalDate \\% 2012-02-12 18:02:04.738 \\
HourEntryBusinessImpl.java-getDailySpentEffortByInterval-QDateTime-QDateTime \\% 2012-02-12 20:29:30.971 \\
%HourEntryBusinessImpl.java-getDailySpentEffortByInterval-QDateTime-QDateTime \\% 2012-02-12 19:00:39.81 \\
HourEntryBusinessImpl.java-getDailySpentEffortByIteration-QIteration-QDateTimeZone \\% 2012-02-12 18:00:09.826 \\
HourEntryBusinessImpl.java-getDailySpentEffortByWeek-QLocalDate \\% 2012-02-12 19:25:18.785 \\
HourEntryBusinessImpl.java-getDailySpentEffortForHourEntries-QList-QDateTime-QDateTime \\% 2012-02-12 18:56:34.409 \\
HourEntryBusinessImpl.java-getEntriesByUserAndDay-QLocalDate \\% 2012-02-12 18:02:50.675 \\
IterationBurndownBusinessImpl.java-getBurndownTimeSeries-QList-QLocalDate-QLocalDate \\% 2012-02-12 19:10:16.384 \\
IterationBurndownBusinessImpl.java-getCurrentDaySpentEffortSeries-QList-QDateTime \\% 2012-02-12 18:19:11.223 \\
IterationBurndownBusinessImpl.java-getEffortSpentDataItemForDay-QDailySpentEffort \\% 2012-02-12 19:15:17.272 \\
IterationBurndownBusinessImpl.java-getEffortSpentTimeSeries-QList-QDateTime-QDateTime \\% 2012-02-12 19:01:14.579 \\
ProjectBusinessImpl.java-getProjectData \\% 2012-02-11 15:23:30.342 \\
\emph{ProjectBusinessImpl.java-retrieveLeafStories-QStoryFilters} \\% 2012-02-11 15:44:00.868 \\
StoryHierarchyBusinessImpl.java-moveUnder-QStory-QStory \\% 2012-02-28 20:34:29.65 \\
HourEntryDAOHibernate.java-getHourEntriesByFilter-QDateTime-QDateTime \\% 2012-02-15 11:30:28.254 \\
DailySpentEffort.java-setDay-QDateTime \\% 2012-02-12 18:10:38.683 \\
StoryHierarchyAction.java-moveMultipleAfter \\% 2012-02-28 21:38:09.445 \\
StoryHierarchyAction.java-moveMultipleBefore \\% 2012-02-28 21:39:41.965 \\
StoryHierarchyAction.java-moveMultipleUnder \\% 2012-02-28 21:37:56.006 \\
StoryHierarchyAction.java-moveStoryUnder \\% 2012-02-25 13:00:26.604 \\
StoryHierarchyAction.java-recurseHierarchyAsList \\% 2012-02-25 13:00:15.767 \\
% \bottomrule
%\end{tabular}
%\label{tab:overlappingtraces}
%\caption{Overlapping code interaction traces that lead to a merge conflict.}
%\end{table}
%
%
%\begin{table}[t!]
%\centering\small
%\begin{tabular}{l}
%\toprule
\midrule
\midrule
Bernard's Method Trace for Task ... \\% Timestamp \\
\midrule
BacklogHistoryEntryBusinessImpl.java-setStoryHi-erarchyDAO-QStoryHierarchyDAO \\% 2012-02-10 13:40:28.454 \\
BacklogHistoryEntryBusinessImpl.java-updateHistory \\% 2012-02-10 13:40:48.593 \\
ProductBusinessImpl.java-store-QProduct \\% 2012-02-23 12:01:52.557 \\
ProjectBusinessImpl.java-persistProject-QProject \\% 2012-02-23 12:04:28.878 \\
ProjectBusinessImpl.java-store-QInteger-QProject-QSet \\% 2012-03-14 14:02:37.353 \\
%14 \\
%ProjectBusinessImpl.java-store-QInteger-QProject-QSet \\% 2012-03-14 15:33:12.686 \\
\emph{ProjectBusinessImpl.java-retrieveLeafStories-QStoryFilters} \\% 2012-02-23 11:59:40.577 \\
StoryBusinessImpl.java-createStoryRanks-QStory-QBacklog \\% 2012-03-14 13:55:45.079 \\
%46 \\
%StoryBusinessImpl.java-createStoryRanks-QStory-QBacklog \\% 2012-03-14 14:10:16.51 \\
StoryBusinessImpl.java-store-QInteger-QStory-QInteger-QSet \\% 2012-03-14 13:57:53.526 \\
%10\\
% StoryBusinessImpl.java-store-QInteger-QStory-QInteger-QSet \\% 2012-03-14 14:07:20.516 \\
StoryWidget.java-getStory \\% 2012-02-09 09:13:07.167 \\
UserLoadWidget.java-getUser \\% 2012-02-22 19:55:33.581 \\
\bottomrule
\end{tabular}
\label{tab:overlappingtraces}
\end{table}
\subsection{Preventable Build Failures}
Table~\ref{tab:overlappingtraces} shows two traces that directly link to one of the major build failures caused by merge conflicts.
Similarly, the two traces representing the change-sets point to the file responsible for the merge conflict.
We chose this particular failed build for three reasons:
(1) this failure occurred early on in the development and was described in most detail by the effected students,
(2) it involves members from different student teams further illustrating issues with cross-team communication,
and (3) it is representative for both the way we characterize technical dependencies (two developer change the same file) and it is the failure type most students complained about in their development diaries.
Each trace contains a list of methods ordered by when the developer selected or edited a method.
The two traces shown in Table~\ref{tab:overlappingtraces} only show edit events, and removed redundancies, when multiple subsequent edit events in the same method occurred.
The two traces overlap by both editing a common function, \emph{ProjectBusinessImpl.java-retrieveLeafStories-QStoryFilters} (emphasis added in the Table), which turned out to be the offending function that causes the build to fail.
Note that the traces in Table~\ref{tab:overlappingtraces} were recorded during the same sprint but shifted in time.
Furthermore, the owners of the two traces, and thus change-sets, belong to different teams whose work is supposed to be integrated towards the end of the sprint.
\subsection{The Right Recommendations}
We saw that most of the build failures that are of high relevance to the developers deal with issues arising from merging code from several teams.
Although when such a build failed, communication among team members spikes, it is confined to the team that merges their code last and thus experiences the build failure once they merged their code.
Thus, the communication that is ongoing in order to resolve the build issues does not necessarily include the people that could have helped to prevent the build from failing.
Nevertheless, the recordings and communication after the merge conflict was resolved allows for a better picture of the people involved.
After the issue was resolved, the team fixing the issue voiced the concern in the following retrospective and sprint planning as well as contacted the respective team.
From the communication it became apparent that the issue could have been prevented if the authors of the conflicting change-set would have known about each other's change.
Using our approach, and given the data we collected from the developer traces, we could have provided a recommendation that would have prevented the build failure by making the developers aware of their overlapping work.
\section{Discussion}
\label{ch10:dis}
In this chapter, we set out to answer the question of whether it is possible to make recommendations to prevent build failures before they happen.
We found that the methods proposed by Blincoe et al.~\cite{blincoe:cscw:2012} gathering real time code interactions using Mylyn to make the post-mortem analysis performed in the traditional socio-technical congruence context~\cite{kersten:aosd:2005} actionable is useful for providing real time recommendations.
Although we did not show the usefulness of using the Blincoe et al.~\cite{blincoe:cscw:2012} approach by implementing a recommender system leveraging the data, nonetheless, the data is in place and shows promise to resolve certain issues before they become problematic.
The course experiment showed that there is a link between the build failures that stems from technical dependencies that can be inferred while developers are working to implement their work items.
\section{Summary}
\label{ch10:con}
We brought this chapter to a close by returning to the initial research question that we had set out to answer:
\begin{description}
\item[RQ 2.3:] Can recommendation actually prevent build failures?
\end{description}
Based on data that can be gathered in a team setting, we showed that recommendations preventing build failure can be generated in a classroom setting.
The student in the globally distributed software engineering course experienced the very build failures that the approach we presented in Chapter~\ref{chap:approach}.
Through interviews and analyzing the students development diaries we could identify a number of build failures that matters to the students.
For a representative build, we traced the cause of the build failure through the coding interactions we recorded from the students, and were able to find that the cause for the build failure stemmed from modifying the same file.
This lends further evidence that the technical dependencies we used in Chapter~\ref{chap:stc-net} to generate actionable insights.
%
We did not provide the actual recommendations but instead traced the build issue in one representative case back to coding errors that could have been uncovered if the respective developers would have talked about their changes.