Farseerfc's Nest - msr//farseerfc.me/en/2012-06-02T10:42:00+09:00MSR 2012 @ ICSE2012-06-02T10:42:00+09:002012-06-02T10:42:00+09:00farseerfctag:farseerfc.me,2012-06-02:/en/msr2012.html
<div class="section" id="mining-software-repository-2012-icse">
<h2><a class="toc-backref" href="#id1">Mining Software Repository 2012 @ ICSE</a></h2>
<p>I participated MSR of this year. We came to University of Zurich early
in the morning. The registration got something wrong when it seems that
Swisses cannot tell the difference among Asians so that name cards of
3 Chinese with family name of Yang are …</p></div>
<div class="section" id="mining-software-repository-2012-icse">
<h2><a class="toc-backref" href="#id1">Mining Software Repository 2012 @ ICSE</a></h2>
<p>I participated MSR of this year. We came to University of Zurich early
in the morning. The registration got something wrong when it seems that
Swisses cannot tell the difference among Asians so that name cards of
3 Chinese with family name of Yang are misplaced. And also the
organization field of Hotta was "Japan, Japan", as if he represented
the Japan.</p>
<div class="section" id="msr-microsoft-research-talk-msr-mining-software-repositories">
<h3><a class="toc-backref" href="#id2">MSR(MicroSoft Research) talk @ MSR(Mining Software Repositories)</a></h3>
<p>The first talk was the keynote given by Mrs Zhang from MSR(MicroSoft
Research @ Asia), so it turned out to be MSR gave keynote of MSR.
The talk was about Software Analysis and their clone detection tool
called XIAO. XIAO was a clone detector developed by MSRA which can be
used as a plugin for Microsoft Visual Studio. XIAO has two part, or
system state: the statics state analysis all the clones which didn't
consider the running time, while the dynamic state need real time response.
The thing I need to develop for Samsung is something like dynamic mode.
I wanted to know more about the internal details about XIAO but the talk
was finished there.</p>
</div>
<div class="section" id="towards-improving-bts-with-game-mechanisms">
<h3><a class="toc-backref" href="#id3">Towards Improving BTS with Game Mechanisms</a></h3>
<p>The contents of this talk is very much like this blog:</p>
<p><a class="reference external" href="http://www.joelonsoftware.com/items/2008/09/15.html">http://www.joelonsoftware.com/items/2008/09/15.html</a></p>
<p>The talk discussed whether the same game mechanism can be applied to
the things like issue tracking or similar. From my point of view, it
is useless to use game mechanism in this situation. The reason that
stackoverflow can success lies on that they just captured the use of
fade system in opensource community, as all hackers like to be approved
as great hacker, as what is happening in Wikipedia. Whether the same
theory can be applied in issue tracking systems inside a internal
company is questionable. Although MSDN has basic the same structure
as Wikipedia, the content of MSDN and Wikipedia have different
involvement of users. So I myself didn't approve this research.</p>
</div>
<div class="section" id="ghtorrent">
<h3><a class="toc-backref" href="#id4">GHTorrent</a></h3>
<p>They slide of this talk can be found from here:
<a class="reference external" href="http://www.slideshare.net/gousiosg/ghtorrent-githubs-data-from-a-firehose-13184524">http://www.slideshare.net/gousiosg/ghtorrent-githubs-data-from-a-firehose-13184524</a></p>
<p>Data exporter for github. Main part of data of Github, namely the hosted
code, are already exposed as git repos, and wiki of repos are stored in
git repo. So the aim of this project is to expose other data such as
issues, code comments, etc. The project access github api and fetch the
needed data as distributed system in order to overcome the limitations
of the github api. The project will provide download history as torrents.
The json data from github api is stored as bson in MongoDB and the parsed
data is stored in MySQL with schema.</p>
<p>From my point of view, it will be better if the format of data can be
uniformed and all data are stored in the git repo as wiki pages.
As the history stored in git repo is more nature, and using <pre><span class="code-line"></span>
<span class="code-line">git blame</span></pre>
to trace author of code comments should also be more useful. Of course
it is harder to read and write the raw data of git as we need more
understanding of the internal format of git. Maybe only people from
github can do this.</p>
</div>
<div class="section" id="topic-mining">
<h3><a class="toc-backref" href="#id5">Topic Mining</a></h3>
<p>I can not understand the two parameters, DE, AIC, used in this research,
study this later. The experiment target of this research are Firefox,
Mylyn and Eclipse. They are trying to analysis the identifiers and
comments from source codes in software repos and find the relationship
between topics and bugs, like what kind of topics are more likely to
contain buggy codes.</p>
<p>The result of this research is not so clear. Such as it said that the
core functions of Firefox have more bug reports, but it said no reason
about this. Maybe this only means that the core features are well
tested, rather than that the core features are more buggy.</p>
<p>But the slides showed by author are pretty and easy to understand.</p>
</div>
<div class="section" id="the-evolution-of-software">
<h3><a class="toc-backref" href="#id6">The evolution of software</a></h3>
<p>The keynote talk of the second day. It is about how should we combine
the social media with software development. Maybe this is the reason
why Github succeeded. In the talk she told about accessing tags,
uBlogs, blogs etc. directly from Integrated Development Environments,
or should we need cloud IDE such as Cloud9.</p>
</div>
<div class="section" id="do-faster-releases-improve-software-quality">
<h3><a class="toc-backref" href="#id7">Do Faster Releases Improve Software Quality?</a></h3>
<p>Used Firefox as example.</p>
<p>The conclusion is that faster releases will lead to more bugs and more
frequent crash, but bugs are get fixed more quickly and user will switch
to new released more quickly.</p>
</div>
<div class="section" id="security-vs-performance-bugs-in-firefox">
<h3><a class="toc-backref" href="#id8">Security vs Performance Bugs in Firefox</a></h3>
<p>Performance bugs are regression, blocks release.</p>
</div>
<hr class="docutils"/>
<div class="section" id="some-of-my-thoughts">
<h3><a class="toc-backref" href="#id9">Some of my thoughts</a></h3>
<div class="section" id="separation-of-commits-based-on-semantic-analysis">
<h4><a class="toc-backref" href="#id10">Separation of commits based on Semantic analysis</a></h4>
<p>The user of some tools (such as git) are not following the design
purposes of these tools which brings some difficulty to MSR. For example
git has a prefect branch system, so it is desired for users of git to
commit per topic. Commit per topic means that user send a commit for a
single implementation of a feature or a bug fix, etc. If it is difficult
to contain all modifications in a commit, then it should be in a
separate branch and merged into master branch. But actually
user tends to send very large commits, that contains many logical
features, and they can not predict to open a new branch until a few
commits.</p>
<p>Maybe this is not the fault of the user of tools, this is the tools
that are not smart enough. We should separate the commits according
to the semantic topics inside a commit.</p>
</div>
<div class="section" id="about-the-slide-systems-used-today">
<h4><a class="toc-backref" href="#id11">About the slide systems used today</a></h4>
<p>The study with title <pre><span class="code-line"></span>
<span class="code-line">Incorporating Version Histories in Information</span>
<span class="code-line">Retrieval Based Bug Localization</span></pre>
used the slides made by beamer. It
contains many equations, used many overlays are iterations, with few
figures, is a typical beamer slide. It also used mindmap very well.</p>
<p>There are at least 3 slides that are made by beamer today.</p>
<p>The study with title <pre><span class="code-line"></span>
<span class="code-line">Towards Improving Bug Tracking Systems with</span>
<span class="code-line">Game Mechanisms</span></pre>
presented with prezi. It have many pictures and many
transitions. But because of it is made by prezi, there are no headers
and footers so no page numbers and section titles etc. This is not
so convenient in such a official occasions because people need to
refer to the page number in question session.</p>
<p>There are at lease 6 presents used Apple Keynote. It is really
difficult to tell the difference between slides made by PowerPoint
and Keynote. 2 of them used the default theme of keynote.</p>
<p>The rest are using PowerPoint. Mrs Zhang from Microsoft used PowerPoint
but her slides looks like beamer very much such as the usage of footer
and header and overlays. If these are made by PowerPoint that will
involve many manually operations.</p>
<p>It is worth to mention that the slides of a study with title <pre><span class="code-line"></span>
<span class="code-line">Green</span>
<span class="code-line">Mining: A Methodology of Relating Software Change to Power Consumption</span></pre>
are all <pre><span class="code-line"></span>
<span class="code-line">badly</span></pre>
drawn hand paintings. The effect of these slide are
well received, they are green and clean and cute. You can refer to the
following animation for the effect but it is not exactly the same version
with what we saw :</p>
<p><a class="reference external" href="http://softwareprocess.es/a/greenmining-presentatation-at-queens-20120522.ogv">http://softwareprocess.es/a/greenmining-presentatation-at-queens-20120522.ogv</a></p>
</div>
<div class="section" id="microsoft-is-meaning">
<h4><a class="toc-backref" href="#id12">Microsoft is MEANING</a></h4>
<p>It is not a news. But Microsoft is the sponsor of Mining Challenge, and
the prize of this challenge will be Xbox and Kinect and the topic of
this year is:</p>
<pre class="literal-block">
Mining Android Bug
</pre>
<p>I see what you are doing there Microsoft ......</p>
</div>
</div>
</div>