% !TEX root = thesis.tex
\startchapter{Methodology and Constructs}
\label{chap:meth}
Before we dive into our actual studies of the effect of socio-technical congruence and its use to form recommendations, we present the overall roadmap for this dissertation (cf. Section~\ref{c5:sec:roadmap}) and some of the common definitions (cf. Section~\ref{c5:sec:definitions}) and constructs (cf. Section~\ref{c5:sec:constructs}) that are used throughout the dissertation.
Furthermore, we will discuss the general approach to the data collection methods that are employed (cf. Section~\ref{c5:sec:datacollection}).
\begin{figure}[h!]
\centering
\includegraphics[height=.9\textheight]{./figures/roadmap}
\caption{Which chapter addresses which research question in relation to our approach to improving social interactions among software developers.}
\label{fig:roadmap}
\end{figure}
\section{Methodology Roadmap}
\label{c5:sec:roadmap}
In this section, we discuss the methods that were applied in order to answer the research questions presented in Chapter~\ref{chap:bg}.
Figure~\ref{fig:roadmap} depicts the relationship between the research questions and the contribution of this dissertation: an approach that attempts to improve social interactions among developers by characterizing the quality of interactions through the outcome of the related build.
Research questions~1.1 and~1.2 discussed in Chapters~\ref{chap:soc-net} and~\ref{chap:stc-net2} motivate the approach.
Research questions~2.1 and~2.3 (cf. Chapters~\ref{chap:stc-net} and~\ref{chap:actionable}) explore whether socio-technical networks can be used to form recommendations that can then prevent build failures, whereas research question~2.2 inquires in Chapter~\ref{chap:talk} whether such recommendations are acceptable to developers.
\subsection{%Research Question 1.1}
%Motivation
%\begin{description}
RQ 1.1: Do Social Networks influence build success?}
%\end{description}
This dissertation's goal is to design an approach that is able to improve the social interactions in the form of communication among software developers.
As a first step, we need to establish if the communication among software developers has an influence on the build success.
%Data Source
We were allowed access to the development repositories used by the IBM Rational Team Concert development team such as their source code management system, communication repositories in the form of work item discussions, and their build results.
All these artifacts are linked together in a way that allows us to trace from the build result which changes went into the build and which work items a change is meant to implement.
%Methods
Using this information, we can construct social networks from all of the work items that are related to builds.
These networks are then described using social network metrics, which form the input for machine learning algorithms that predict whether a build is more likely to fail or succeed.
%Expected results
Via this machine learning approach we want to establish a connection between a build's social network and its outcome.
If we are able to predict the build outcome more accurately than by simply guessing according to the likelihood of a build failure, we demonstrate that there is a statistical relationship between build outcome and social networks.
This result forms the first evidence that manipulating the social network might yield a positive effect on build success.
\subsection{%Research Question 1.2}
%\begin{description}
RQ 1.2: Do Socio-Technical Networks influence build success?}
%\end{description}
%Motivation
Knowing that a social network can influence the success of the corresponding build leads us to question how networks should be manipulated in order to improve the likelihood for a build to succeed.
Therefore, we explore the relationship between build success and socio-technical networks in general, and gaps within those networks in particular.
A gap is formed by two developers that share a technical dependency but failed to communicate about work related to the build of interest.
%Data Source
As for the previous research question, we base the analysis on the same data set, which allows us to directly infer technical relationships among developers related to a software build from the changes submitted to the source code management tool.
We used these changes previously to infer the work items developers used to communicate with each other about the build.
%Methods
Since socio-technical networks have two semantically different kinds of edges connecting two developers within a network (technical dependencies and communication among developers), we refrain from using social network metrics, as they assume only one mode of connection among nodes within a network.
Instead, we investigate the relationship between the socio-technical congruence index and build success, and focus on the influence of gaps in the network on build success.
%Expected results
Via statistical analysis methods such as regression analysis we want to establish a relationship between the existence of gaps within the socio-technical network and build success.
By addressing this research question we obtain another piece of evidence that allows us to formulate an approach that recommends actions to increase build success, specifically actions that alleviate gaps within the socio-technical network by recommending that developers communicate.
\subsection{%Research Question 2.1}
%\begin{description}
RQ 2.1: Can Socio-Technical Networks be manipulated to increase build success?}
%\end{description}
%Motivation
The previous two research questions enable us to formulate an approach to generate recommendations that are meant to foster communication among developers in order to increase build success.
This leads to the next step, in which we explore whether this approach can generate recommendations that show a statistical relationship to build success.
%Data Source
Using the same data source as we did earlier, we try to relate individual recurring gaps in socio-technical networks to build failure.
%Methods
Knowing those gaps, that is, pairs of developers that frequently share a technical dependency without communicating with respect to a build that failed, we check whether adding a social dependency would change the likelihood that a build fails.
%Expected results
We expect to find a number of gaps that, when mitigated, increase the likelihood of build success.
\subsection{%Research Question 2.2}
%\begin{description}
RQ 2.2: Do developers accept recommendations based on software changes to increase build success?}
%\end{description}
%Motivation
Before exploring whether the recommendation holds actual value concerning the prevention of build failures, we explore whether developers would welcome recommendations with respect to changes.
%Data Source
To do this, we joined the development effort of one of the Rational Team Concert development teams as a participant observer to gain insight into how the developers actually communicate during their day-to-day work.
%Methods
We complement these observations with follow-up interviews in order to gain a better understanding of the team dynamics and their discussion topics, since as a project newcomer our work is limited to more basic tasks in contrast to higher-level decision making.
To extend our reach beyond the local team, we deployed a questionnaire to the product team at large to gain a better understanding of whether the recommendations we would supply could be easily integrated into their typical discussions with fellow developers.
%Expected results
This study should give us a better understanding of whether developers are discussing individual changes, thus justifying the appropriateness of the level of recommendations.
Furthermore, we expect to uncover general suggestions on when and how to supply such recommendations as developers might not always be interested in individual changes even when they may pose a threat to build success.
\vspace{-5pt}
\subsection{%Research Question 2.3}
%\begin{description}
RQ 2.3: Can recommendations actually prevent build failures?}
%\end{description}
%Motivation
\vspace{-7pt}
We conclude our evaluation of our proposed approach with a proof of concept in a more controlled environment to ascertain whether the recommendations we can generate actually help in preventing builds from failing.
As root cause analysis of failures can be very tedious, and due to the various reasons a build can fail that are not necessarily due to software changes, we depart from studying the development of Rational Team Concert and observe students extending an open source project during a course taught at the University of Victoria, Canada, and Aalto University, Finland.
%Data Source
This change in setting allows us to ask the study participants to use tools we developed to collect more fine grained data on their development efforts, such as when they edited which parts of the source code.
Additionally, we have access to the actual project repositories, such as source code management, issue tracking, email, and text chat sessions.
Furthermore, we ask the students to answer regular questionnaires and to keep a development diary, both of which are meant to collect information on issues that the students encounter during the course and that are specifically related to their development effort.
%Methods
We then analyze the collected data from the questionnaires and development diaries manually, by reading and annotating the different responses and entries to uncover build issues that were important to the students.
Knowing the build issues that the students encountered, we select the most severe in terms of the number of reports and trace them to their actual root cause by identifying the code changes that caused them (if applicable).
We then analyze the offending changes and identify the corresponding communication.
%Expected results
We expect to find one representative example of a failed build that the students encounter during the course that can be resolved using recommendations based on information that is available before the students actually incorporate their modifications into the source code repository of the project.
\section{Definitions}
\label{c5:sec:definitions}
We start by giving our definitions of the three constructs that we rely upon heavily: (1) Work Item, a complete unit of work; (2) Change-Set, the technical work submitted by a developer; and (3) Build, a testable product.
\subsection{Work Item}
A \emph{Work Item} is a unit of work that can be assigned to a single developer.
A unit of work can be anything from a bug-report, reported by either the end user or by a developer, to a feature implementation.
\emph{Work Item}s can be hierarchically organized to show the work breakdown from high-level requests to manageable pieces.
One project team member is responsible for completing a work item; the sub work items that it is broken down into do not necessarily need to be assigned to the owner of the parent work item.
For instance, in the case of IBM Rational Team Concert, the development team creates story items to describe larger functionality from the user point of view and assigns them depending on the complexity and implication of the story.
The owner of the story then either breaks down the story into multiple stories or tasks that are again assigned to team members according to their complexity and implications.
Once the work item level is sufficiently low, the developer assigned to it can make the necessary modifications to the project to accomplish the work detailed in the work item.
\begin{note}
\begin{mydef}
A \emph{Work Item} is a defined and assignable unit of work.
\end{mydef}
\end{note}
\subsection{Change-Set}
A \emph{Change-Set} is a set of source code changes applied to a number of source code files, with a file being the artifact that a developer would change to add to, modify, or delete from the current product.
The developer that applied those changes to the product bundles them into one or multiple change-sets.
For example, in the Eclipse project\footnote{\url{http://www.eclipse.org}} the developers use CVS\footnote{\url{http://www.nongnu.org/cvs}} as their version control system to manage changes to the Eclipse IDE.
A developer checks out the current version of the repository and starts editing, creating, and deleting files in order to fulfill the work item she is currently working on.
Once the developer decides that she has accomplished the work to finish her current work item, she commits her changes.
Those changes that consist of file creations, deletions, and modifications taken together are referred to as a \emph{Change-Set}.
\begin{note}
\begin{mydef}
A \emph{Change-Set} is a set of modifications, additions, and deletions of software artifacts such as source code files, classes or methods.
\end{mydef}
\end{note}
\subsection{Build}
The goal of each software development team is to deliver a finished or improved product at some point in time.
This finished product is often referred to as the final \emph{Build}.
A \emph{Build} can generally be referred to as any instance of the product that can be run to some extent.
To create a build, a team gathers all of the changes implementing work items that are required for the new build and compiles and packages the product.
The number of work items and their respective changes included in a build gradually increases over time, since more work is completed as the project progresses.
In the case of the IBM Rational Team Concert development team, builds are created on a frequent basis to test the product as a whole in order to look for integration issues.
The team also subscribes to the philosophy of using its own products and thus tries to bring each build to a level at which it can then be used for development.
This intense use enables the team to spot issues that are still within the product and enables them to assess the severity of these issues.
\begin{note}
\begin{mydef}
A \emph{Build} is a version of the product that is executable to some extent and that includes a number of changes implementing work items.
\end{mydef}
\end{note}
\section{Constructs}
\label{c5:sec:constructs}
From the definitions introduced previously we can derive the three central constructs that we work with in this dissertation: (1) the social network connecting communicating and coordinating developers, (2) the technical network connecting developers that are dependent through code artifacts, and (3) the socio-technical network that combines the social and technical network in a meaningful way.
These constructs are important for the three chapters that are mining the repository provided by the Rational Team Concert development team (cf. Chapters~\ref{chap:soc-net},~\ref{chap:stc-net2}, and~\ref{chap:stc-net}).
\subsection{Social Network}
\begin{figure}[t!]
\begin{center}
\includegraphics[height=1.3\textwidth]{./figures/grand_figure}
\caption{Social network construction examples in our approach}
\label{fig:network}
\end{center}
\end{figure}
To illustrate our approach to constructing social networks, we go through the example of a failed build illustrated in Figure~\ref{fig:network}.
A social network is represented as a graph that consists of nodes connected by edges.
In our approach, the nodes represent people and edges represent task-related communication between these people.
The approach is repository and tool independent and can be applied to any repositories that provide information about people, tasks, technical artifacts, and communication.
This includes work, issue, or change management repositories, such as BugZilla or IBM Rational Team Concert; source code management systems, such as CVS or IBM Rational Team Concert; and even communication repositories, such as email archives.
We construct and analyze social networks within a collaboration scope, which defines the people and interactions of interest.
In this example, around Failed Build 1, the collaboration scope is the communication of the contributors to the failed build.
Other examples include the collaboration of people working on a critical task, in a particular geographical location, or in a functional team such as testing.
There are three critical elements that are necessary to construct task-based social networks for a collaboration scope and that need to be mined from software development repositories:
\begin{description}
\item[Project Members] are people who work on the software project.
These project members can be developers, testers, project managers, requirements analysts,
or clients.
Project members, such as Cathrin and Eve, become nodes in the social network.
\item[Work Items] are units of work (as defined earlier) within the project that may create a need to collaborate and communicate.
Examples of work items include resolving Bug~123 or implementing the GUI API.
More generally, implementing feature requests and requirements can also be considered collaborative tasks.
\item[Work Item Communication] is the information exchanged while completing a work item and is the unique information that allows us to build task-based social networks.
In our example, dashed black lines represent task-related communication such as a comment on Bug 123, or an email or chat message about GUI API.
Task-related communication is used to create the edges between developers in the social networks.
\end{description}
The data underlying the social network used throughout this dissertation is based on work items and their associated discussions.
In IBM Rational Team Concert each work item has an attached discussion thread where developers can discuss the work item or simply note down their thoughts while working on the work item.
This means that we create a link between two developers if they both comment on the same work item, indicating that they are part of the same discussion.
Note that this section draws heavily from our work done in collaboration with Timo Wolf, Daniela Damian, Lucas Panjer, and Thanh Nguyen~\cite{wolf:ieee:2009}.
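The linking rule described above can be sketched in a few lines of Python.
This is a minimal illustration and not the tooling used in our studies: the function name \texttt{build\_social\_network}, the input format (a mapping from work item to the list of commenting developers), and the developer Adam are assumptions made for the example.

```python
from itertools import combinations


def build_social_network(work_item_comments):
    """Return undirected edges between developers who commented on the same work item.

    work_item_comments: hypothetical mapping {work item id: [commenter, ...]}.
    Each edge is a tuple of two developer names in alphabetical order.
    """
    edges = set()
    for commenters in work_item_comments.values():
        # A developer may comment several times; count each developer once.
        for a, b in combinations(sorted(set(commenters)), 2):
            edges.add((a, b))
    return edges


# Illustrative data only; "Adam" is a made-up project member.
comments = {
    "Bug 123": ["Cathrin", "Eve", "Cathrin"],
    "GUI API": ["Eve", "Adam"],
}
social_edges = build_social_network(comments)
```

Storing each edge as an alphabetically ordered pair keeps the network undirected, which matches the idea that two developers are simply part of the same discussion.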
\subsection{Technical Network}
\begin{figure*}[t!]
\centering
\includegraphics[width=.7\textwidth]{figures/cochangedfiles}
\caption{Creating a technical network by connecting developers that changed the same file.}
\label{fig:addtechnicaledge}
\end{figure*}
% some preamble as in the previous subsection
Building technical networks follows an approach very similar to the one we described for building social networks.
In fact, a technical network has the same structure as a social network; the main distinction lies in the way edges between nodes are created.
We derive the name of technical networks from the way we link developers together, namely if they are modifying related source code artifacts.
As in the previous network construction, the construction of the technical network is based on three components:
\begin{description}
\item[Project Members] are people who work on the software project.
These project members can be developers, testers, project managers, requirements analysts,
clients, or in general anyone who modifies software artifacts through change-sets.
In Figure~\ref{fig:addtechnicaledge} project members, Alfred and Bob, become nodes in the technical network because they modified the same file.
\item[Change-Sets] are changes made to software artifacts by individual users.
A set consists of a number of artifacts that have been modified as well as the modifications themselves.
For example, Alfred as shown in Figure~\ref{fig:addtechnicaledge} modified File$_{\text{A}}$ and File$_{\text{B}}$.
\item[Software Artifact Relation] describes the relation between developers in the technical networks.
This relationship can be defined in several different ways.
For example, in Figure~\ref{fig:addtechnicaledge} Alfred and Bob are related through a technical relationship because they modified the same file.
Note that we are mostly interested in relationships between artifacts that are affected by changes.
\end{description}
% Example as shown in the picture
Therefore, constructing technical networks follows three steps: (1) gather all change-sets of interest, (2) identify the relations between artifacts, (3) infer the relation between the artifact owners using the change-sets and the relations between the source code artifacts.
For example, after having selected the set of change-sets of interest we define the change-sets themselves as the source code artifact and identify the owners of those artifacts.
We then infer the relationship between those source code artifacts by relating all change-sets that affect the same source code file.
In the case of Alfred and Bob, this means that they are connected because both own a change-set that modifies the same file (cf. Figure~\ref{fig:addtechnicaledge}).
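The three construction steps can be sketched as follows in Python.
This is only an illustration under stated assumptions: the function name \texttt{build\_technical\_network} and the representation of a change-set as an (owner, files) pair are hypothetical, and the example assumes that the file shared by Alfred and Bob is FileB.

```python
from collections import defaultdict
from itertools import combinations


def build_technical_network(change_sets):
    """Connect change-set owners whose change-sets modify the same file.

    change_sets: hypothetical list of (owner, [file, ...]) pairs.
    Returns a set of undirected edges as alphabetically ordered name pairs.
    """
    # Steps 1 and 2: collect, per file, the owners of all change-sets touching it.
    owners_by_file = defaultdict(set)
    for owner, files in change_sets:
        for path in files:
            owners_by_file[path].add(owner)

    # Step 3: owners that share a file are technically related.
    edges = set()
    for owners in owners_by_file.values():
        for a, b in combinations(sorted(owners), 2):
            edges.add((a, b))
    return edges


# Illustrative data; which file Bob modified is an assumption.
change_sets = [
    ("Alfred", ["FileA", "FileB"]),
    ("Bob", ["FileB"]),
]
technical_edges = build_technical_network(change_sets)
```

Other artifact relations, such as call dependencies between files, could replace the co-change rule in step 3 without changing the overall structure.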
\subsection{Socio-Technical Network}
\begin{figure}[t!]
%
\centering
\subfloat[Inferring the change-sets and work items relevant to the build focus.]{
\includegraphics[width=.46\textwidth]{figures/buildworkitem}
\label{fig:construct-focus}
}
%
\hspace{8pt}
\subfloat[Constructing a social network from work item communication.]{
\includegraphics[width=.46\textwidth]{figures/buildsn}
\label{fig:construct-sn}
}
\subfloat[Linking developers in a technical network via change-set overlaps.]{
\includegraphics[width=.46\textwidth]{figures/cochangedfiles}
\label{fig:construct-tn}
}
%
\hspace{8pt}
\subfloat[Combining social and technical networks into a socio-technical network.]{
\includegraphics[width=.46\textwidth]{figures/stc-net}
\label{fig:construct-combine}
}
\caption{Constructing socio-technical networks from the repository provided by the IBM Rational Team Concert development team.}
\label{fig:construct-stc}
\end{figure}
Socio-Technical networks are a meaningful combination of both social and technical networks.
This meaningful combination is reflected in the selection of the work items for building the social network and in the selection of the change-sets and their relations for building the technical network.
Hence, constructing a socio-technical network requires the following four steps:
\begin{enumerate}
\item\textbf{Selecting the Focus} used for the socio-technical network represents the glue that binds the social and technical network into a socio-technical network.
This focus, also referred to as filter in our earlier publication~\cite{wolf:ieee:2009}, determines the content of the networks.
In this dissertation, we select as the focus software builds to construct networks that describe the coordination among developers for a given software build.
\item\textbf{Constructing the Social Network} follows the description above with the focus determining the work items that are selected in order to generate the nodes from the work item participants and the edges from the communication among the participants through a work item.
\item\textbf{Constructing the Technical Network} follows the description of constructing technical networks above, with the focus determining the change-sets being used to determine and connect developers in the network.
\item\textbf{Combining Networks} is the final step that overlays the networks by unifying the set of developers.
Thus, a pair of developers can be directly connected to one another through two edges, one representing the edge from the technical network and the other the edge from the social network.
\end{enumerate}
% describe the example in the figure
Figure~\ref{fig:construct-stc} shows an example of how we, in our studies of the IBM Rational Team Concert development team, create socio-technical networks.
In the first step (cf. Figure~\ref{fig:construct-focus}) we set the focus on a software build which allows us (via the change-sets that made it into the build) to infer what work items are also represented in said build.
Given the focus, the social network can be constructed using the work items that can be linked to the software build (cf. Figure~\ref{fig:construct-sn}).
Similarly, the construction of the technical network relies on the change-sets that went into a build.
To actually infer edges between developers, we rely on co-changed files within a build as an indicator of work dependency (cf. Figure~\ref{fig:construct-tn}).
Finally, the two networks are combined and yield the socio-technical network shown in Figure~\ref{fig:construct-combine}.
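The combination step, and the notion of a gap as a pair of developers with a technical edge but no social edge, can be sketched as follows.
The function names \texttt{combine\_networks} and \texttt{find\_gaps}, the edge representation, and the example data are assumptions made for illustration; they do not describe the implementation used in our studies.

```python
def combine_networks(social_edges, technical_edges):
    """Overlay two networks by unifying the set of developers.

    Returns a mapping {developer pair: set of edge kinds}, so a pair can
    carry a "social" edge, a "technical" edge, or both.
    """
    combined = {}
    for kind, edges in (("social", social_edges), ("technical", technical_edges)):
        for a, b in edges:
            pair = tuple(sorted((a, b)))  # undirected: normalise the order
            combined.setdefault(pair, set()).add(kind)
    return combined


def find_gaps(combined):
    """Return pairs that share a technical dependency but did not communicate."""
    return sorted(pair for pair, kinds in combined.items() if kinds == {"technical"})


# Illustrative data only.
social = {("Cathrin", "Eve"), ("Alfred", "Eve")}
technical = {("Alfred", "Bob"), ("Eve", "Alfred")}
network = combine_networks(social, technical)
gaps = find_gaps(network)
```

In this example, Alfred and Eve are connected through both kinds of edges, whereas Alfred and Bob form a gap.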
\section{Data Collection Methods}
\label{c5:sec:datacollection}
To conduct the research for this dissertation we drew upon multiple data sources.
We employ repository-mining techniques to identify larger trends in measurable activities.
To gain a more in-depth understanding of how developers actually work and deal with interdependencies, and especially of how they would react to certain recommendations and whether these can be made useful, we employ qualitative methods.
\subsection{Repository Mining}
Software development usually uses a number of tools to manage information electronically, such as version archives and issue trackers.
In addition to storing source code and tasks/issues, these software repositories can also contain digital communication, such as forum and email discussions, logs of IRC\footnote{\url{http://en.wikipedia.org/wiki/Internet_Relay_Chat} last visited June 8th, 2012}, and instant messenger chats.
Repositories can grow to considerable sizes depending on the project's life span and intensity.
Therefore, it is often not possible to manually review the history of a project and it is then necessary to employ data-mining techniques in order to analyze trends.
Although this approach is limited in terms of gaining a deeper understanding of the intricacies of a development project, it is nevertheless invaluable for placing in-depth results in the bigger picture of the project.
Furthermore, data mining approaches are one way to easily give back value to the development team without burdening any individual developer by diverting time to other non-automatic data collection instruments.
This is an important point, as one goal of this dissertation is to explore the ability of the concept of socio-technical congruence to generate actionable recommendations.
In case a developer needs to personally provide a large amount of information manually, the overhead generated by a system might outweigh the benefit of recommendations and therefore render the system useless.
We extract information from three different types of repositories: (1) version control, (2) task management, and (3) build engine systems.
The version control supplies us with the knowledge of how developers are connected through their technical work.
The task management supplies us with information on which individuals were communicating with respect to a specific work item, and lastly the build engine supplies us with the focus to construct socio-technical networks.
In order to derive socio-technical networks we need to link the different artifact types.
Within IBM Rational Team Concert, as illustrated in Figure~\ref{fig:construct-focus}, work items are linked to change-sets and change-sets are linked to builds, thereby establishing the connections needed to construct socio-technical networks with a build as focus.
Similarly, these links can be inferred, as proposed by Cubranic et al.~\cite{cubranic:tse:2005}, from repositories that are missing the capabilities of creating formalized links.
We used repository mining techniques in Chapters~\ref{chap:soc-net},~\ref{chap:stc-net2}, and~\ref{chap:stc-net} to explore the rich repositories provided by the IBM Rational Team Concert team.
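To make the link traversal concrete, the following Python sketch follows the links from a build to its change-sets and from there to the work items they implement.
It is a simplified illustration: the function name \texttt{select\_focus}, the two lookup tables, and all identifiers in the example data are hypothetical and do not reflect the actual repository interface of IBM Rational Team Concert.

```python
def select_focus(build_id, build_change_sets, change_set_work_items):
    """Collect the change-sets and work items that belong to one build.

    build_change_sets:     hypothetical mapping {build id: [change-set id, ...]}
    change_set_work_items: hypothetical mapping {change-set id: [work item id, ...]}
    """
    change_sets = list(build_change_sets.get(build_id, []))
    # A work item may be implemented by several change-sets; report it once.
    work_items = sorted(
        {item for cs in change_sets for item in change_set_work_items.get(cs, [])}
    )
    return change_sets, work_items


# Illustrative data only; "cs-3" is not part of the build and is ignored.
build_change_sets = {"Build 1": ["cs-1", "cs-2"]}
change_set_work_items = {
    "cs-1": ["Bug 123"],
    "cs-2": ["GUI API", "Bug 123"],
    "cs-3": ["Story 7"],
}
focus_change_sets, focus_work_items = select_focus(
    "Build 1", build_change_sets, change_set_work_items
)
```

The selected change-sets would then feed the technical network and the selected work items the social network.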
\subsection{Surveys}
To complement the insights that were obtained from mining repositories we use surveys.
Surveys are designed iteratively and piloted before deployment.
They are intended to collect input that enriches and clarifies the information obtained from the software repositories.
With each survey we try to minimize the time each developer needs to spend completing it, which usually limits us to closed questions offering prepared answers.
We constrain ourselves in this way to minimize the distraction to each individual developer and thus increase the response rate.
Our surveys are deployed through web services in an attempt to make the collection more convenient for each developer, as they spend most of their time working at a computer.
Keeping track of a paper version is more cumbersome as they might not easily be returned, especially considering that the development teams we are collaborating with are distributed across different continents.
We deployed surveys both with the Rational Team Concert development team (Chapter~\ref{chap:talk}) and when working with students on a large course project at the University of Victoria, Canada, and Aalto University, Finland (Chapter~\ref{chap:actionable}).
\subsection{Observations}
The next richer method of data gathering, which is also more distracting to the developer, is observation.
The act of observing can distract developers and also change their behaviour, even though we do not actively interrupt or distract them.
In order to minimize this type of distraction and to mitigate the observer bias, we employed a special form of observation study known as participant observation.
In short, we became both an observer and a participant.
%
This has a multitude of advantages:
\begin{itemize}
\item\textbf{Reciprocity.} By participating in the actual development we can provide value to the development team from the very beginning.
This, in turn, motivates the developers to give us the time we need to conduct other parts of the study, like surveys and interviews.
\item\textbf{Learning the Vocabulary.} Each development project has its own project vocabulary~\cite{espinosa2007:team_knowledge} in order to effectively and clearly communicate.
Understanding this vocabulary as an outsider can be difficult, but is definitely very important when it comes to making sense of comments.
\item\textbf{Understanding the Context.} For example, in one study, our observation period coincided with the months prior to a major release during which the team focused on extensive testing rather than new feature development.
% something about the context helping
Due to the proximity of the observation period to a major release we observed mainly activities around integration testing with little coding activity aside from fixing major bugs.
Although it is easy to ascertain when the next major release of the product is, the effect this has on the developer, besides a change in the process, is harder to gauge.
Being part of the development effort allowed us to better understand how developers react to the change in process and better understand their struggles.
\item\textbf{Asking more Meaningful Questions.} A better understanding of the project and how it affects the individual developer, together with a better understanding of the vocabulary, helps with phrasing questions that are more meaningful to the developer.
\end{itemize}
Besides gaining a better understanding of some easily missed (or misunderstood) intricacies, working together with the developer establishes a trust relationship~\cite{letherbridge:ese2005}.
This trust helps to mitigate the biases that are introduced by merely observing, and it prompts developers to be more forthcoming during interviews and surveys~\cite{letherbridge:ese2005}.
It is difficult to separate the different data collection methods that involve human interaction; therefore, we think combining these data collection methods in the right order can greatly enhance the quality of the collected data.
We were able to join the Rational Team Concert development team as a participant observer mainly due to the convincing argument that we would take the place of an intern and thus contribute to their development effort (Chapter~\ref{chap:talk}).
\subsection{Interviews}
To further enhance our understanding of how developers view the situation and further make sense of survey responses as well as results from mining repositories and our observations, we employed interviews.
Instead of following a structured interview approach, we opted for a semi-structured interview with a focus on war stories.
War stories~\cite{lutters:ist:2007} ask the interviewee to share memorable stories from work life.
The interviewer can explore these war stories and help shape the focus of the discussion of the events.
This type of interview comes with two major benefits over structured interviews that follow a set of questions:
\begin{itemize}
\item\textbf{Focus on events important to the interviewee.}
Our goal with this dissertation is to better support software development.
Knowing the pain points, as the project participants perceive them, allows us to focus on important issues.
With prepared questions, the interview might not uncover what is important to the interviewee, and thus we might miss relevant areas.
\item\textbf{Better recall of events by interviewee.}
Recall of important events is better than recall of arbitrary ones~\cite{lutters:ist:2007}.
This allows us to place more confidence in the reports and answers given by the interviewees.
In contrast, a structured interview is guided by a set of prepared questions and goals that do not necessarily address events that the interviewee can easily recall.
\end{itemize}
The main drawback of using war stories over prepared interview questions in a structured interview framework lies with the loss of focus of the interviews.
By asking the interviewee to tell war stories of memorable events it can be more difficult to gain insight into a particular area of interest if the war stories veer too far off the topic.
It is therefore necessary that the interviewer have a good understanding of the project and the project language in order to explore the stories for relevance in terms of the topics of interest, thus making the process more demanding for the interviewer.
% talk about the time of the interviews
We tried to minimize the interruption to project members as much as possible.
To do this, we limited the time we required for each interview as much as possible.
We aimed to keep each interview to approximately 30 minutes, with a 30-minute overflow in case a participant desired to continue the interview.
Furthermore, we gave the work of the interviewee priority over the interview, and assured the interviewees that they could stop the interview if at any point they felt that their work needed attention.
This was especially valuable with a professional development team such as the IBM Rational Team Concert development team as we joined their development effort when they were nearing a major milestone (cf. Chapter~\ref{chap:talk}).
\section{Summary}
In this chapter, we presented the methodology and constructs used throughout this dissertation.
We started with a roadmap describing the methodology used to answer our five research questions.
Subsequently, we gave our definitions of what we refer to as a work item, change-set, and a build.
These three definitions build the foundation that enables us to conceptualize social interaction and technical dependencies into networks of developers.
We further described the methods we used to elicit our data while we conducted our studies.
To minimize the interruption of actual development, we focus on exploiting repository mining techniques.
When interacting with developers, we use a mixed-method approach consisting of observations, surveys, and interviews.