11  
Ernesto Soto Gómez  
Universidad de las Ciencias Informáticas  
RECIBIDO 03/12/2019 ACEPTADO 20/12/2019 PUBLICADO 30/03/2020  
ABSTRACT  
GitHub is a platform that provides hosting for software development version control using Git. It  
features an application programming interface to allow the software to interact with the platform.  
The enormous quantity of information Hosted in GitHub may be useful to make studies about the  
current presence of development tools in the open-source software development community.  
However, the search engine has restrictions that make it impossible to issue complex queries to  
the platform. In this report, it is described as an object-oriented and extensible solution, named  
QuantityEr, to obtain the number of search results of complex queries to GitHub by using the  
inclusion-exclusion principle. The mathematical definitions, as well as related concepts, are  
presented. The mathematical model is discussed. The application of general design and used  
development tools are presented. Also, the results of the execution examples are showed. It is  
concluded that the treated problem has been solved although more work may be done to improve  
the solution.  
Keywords: search results amount, GitHub, inclusion-exclusion principle, object-oriented  
programming, Python.  
RESUMEN  
GitHub es una plataforma que proporciona alojamiento para el control de versiones de desarrollo  
de software utilizando Git. Cuenta con una interfaz de programación de aplicaciones para permitir  
que el software interactúe con la plataforma. La enorme cantidad de información alojada en  
GitHub puede ser útil para realizar estudios sobre la presencia actual de herramientas de  
desarrollo en la comunidad de desarrollo de software de código abierto. Sin embargo, el motor  
de búsqueda posee restricciones que hacen imposible emitir consultas complejas a la plataforma.  
En este informe, se describe una solución extensible y orientada a objetos, llamada QuantityEr,  
para obtener la cantidad de resultados de búsqueda de consultas complejas a GitHub utilizando  
el principio de inclusión-exclusión. Se presentan las definiciones matemáticas y los conceptos  
relacionados. Se discute el modelo matemático. Se presentan el diseño general de la aplicación  
y las herramientas de desarrollo utilizadas. Además, son mostrados resultados de ejemplos de  
ejecución. Se concluye que el problema tratado ha sido resuelto, aunque se puede trabajar para  
mejorar la solución.  
REVISTA INNOVACIÓN Y SOFTWARE  
VOL 1 Nº 1 Marzo - Agosto 2020 ISSN Nº 2708-0935  
12  
Palabras claves: cantidad de resultados de búsqueda, GitHub, principio de inclusión-exclusión,  
programación orientada a objetos, Python.  
INTRODUCTION  
1
2
GitHub is a platform that provides hosting for software development version control using Git .  
It provides several collaboration features such as bug tracking, feature requests, task  
management, and wikis for every project. It also features an application programming interface  
3
(
API) to allow software to interact with the platform [1]. Through this API a search engine can  
be accessed. The search engine allows users to find almost every single aspect across several  
4
projects, source codes and other areas and features of the platform [2]. A web page that serves  
5
as an interface to the search API is also available .  
As of August 2019, GitHub reports having over 40 million users and more than 100 million  
6
repositories . This enormous quantity of information may be useful, among other things, to obtain  
the number of projects, source codes, issues, etc, that mention a set of technologies, tools,  
development libraries, etc, in order to make studies about the current presence of these tools in  
the open source software development community. Other kind of quantitative studies may be  
done as well [3]. Examples of those kinds of research are [47].  
However, the search engine has some restrictions that make impossible to issue complex queries  
to the platform. According to the GitHub Developer Guide , the restrictions are the following:  
The Search API does not support queries that:  
are longer than 256 characters (not including operators or qualifiers).  
have more than five AND, OR, or NOT operators.  
For authenticated requests can be made up to 30 requests per minute. For unauthenticated  
requests, the rate limit allows making up to 10 requests per minute.  
Furthermore, if the search is over source code files, especial restrictions apply7.  
1
2
3
4
5
6
7
REVISTA INNOVACIÓN Y SOFTWARE  
VOL 1 Nº 1 Marzo - Agosto 2020 ISSN Nº 2708-0935  
 
13  
A system named GHTorrent have been already developed to ease the interaction with the large  
8
quantity of information hosted in GitHub [8]. This solution is mainly conceived to mirror the data  
hosted in GitHub in order to facilitate parallel access and studies on snapshots of the data, but  
does not provide an alternative to making complex queries to GitHub. In fact, this system has its  
910  
own restrictions on the quantity of data that can be accessed at any time . Also, the system  
1112  
only provides snapshots for a reduced set of projects  
. Moreover, its design is centered only on  
the interaction with the repositories of GitHub. This means, for example, that search on source  
code is not allowed. Furthermore, the objective of the system is to interact with GitHub, which  
means that a future interaction with other platforms is not currently conceived.  
1
3
14  
A different kind of alternative is GH Archive which records events form GitHub . The recorded  
15  
data can be accessed through BigQuery which allows any kind of SQL-like queries. GH Archive,  
although a powerful and flexible solution, does not constitute an alternative to explore the data  
stored in GitHub but a tool to explore the data that represents the interaction with GitHub. This  
means that, for example, searching inside public source code cannot be done with GH Archive.  
Moreover, both of these systems are server like development tools and not client applications  
ready to use for making queries.  
In the context of this article, complex queries are those that have many logical connectives and  
sub-expressions for example: A OR (C AND (D OR E))especially those that exceed the allowed  
number of logical operators. By getting the results number of queries of this kind, analysis of the  
current presence of technologies might be done. Although many reporting tools has been  
developed none of them are capable of getting the results number of complex queries directly to  
GitHub. Some of these tools are listed in https://www.gharchive.org/. Another example not listed  
in previous URL is https://www.programcreek.com/. In that case the reports are just for  
statically-selected libraries from statically-selected languages.  
16  
In this report, it is described a simple solution, named QuantityEr , to obtain the search results  
number of complex queries directly to GitHub. The proposed design was conceived with the aim  
of extension in mind, in such a way that it would be possible to incorporate the ability to interact  
with other similar platforms besides GitHub as well as other queries languages and algorithms for  
obtaining the amount of search results.  
8
9
1
1
1
1
1
1
1
0
1
2
3
4
5
6
REVISTA INNOVACIÓN Y SOFTWARE  
VOL 1 Nº 1 Marzo - Agosto 2020 ISSN Nº 2708-0935  
14  
The current document is structured in the following manner. Section exposes some mathematical  
definitions and concepts necessary to understand the proposed solution. Section describes the  
proposed solution as well as some usage examples. Section makes the final remarks and  
conclude.  
MATHEMATICAL BACKGROUND  
In order to understand the proposed solution, some mathematical background is necessary. To  
archive a self- contained report, in this section is mentioned the principal mathematical concepts  
used in the design of the solution. The following definitions (or equivalent ones) as well of other  
complementary concepts and profs can be found in the cited references [917].  
The following notations will be used in this report.  
(A) denotes the power set of a set A, that is the set of all subsets of A.  
|A| denotes the cardinality of a set A, that is the number of elements in A.  
 denotes the empty set.  
Boolean algebras  
The first essential concept important to the design of the proposed solution is that of Boolean  
algebra.  
Definition 1. A Boolean algebra is a tuple (푆, +,·, 퐼, ⊥, 푇) where  is a set containing distinct  
elements  and , + and · are binary operators on  and  is a unary operator on . Every  
Boolean algebra satisfies the following laws for all 푥, 푦, 푧 ∈ 푆.  
Commutative laws:  + 푦 = 푦 + 푥  
푥 · 푦 = 푦 · 푥  
Distributive laws:  
Identity laws:  
푥 · (푦 + 푧) = (푥 · 푦) + (푥 · 푧)  
푥 + (푦 · 푧) = (푥 + 푦) · (푥 + 푧)  
푥 · 푇 = 푥  
푥 + ⊥ = 푥  
Complement laws:  
푥 + 푥 = 푇  
푥 · 푥 = ⊥  
Associative and idempotent laws, as well as other laws can be also considered since they follow  
from the definition laws. Furthermore, other useful operators can be derived from the previous  
ones [12, 14, 16].  
REVISTA INNOVACIÓN Y SOFTWARE  
VOL 1 Nº 1 Marzo - Agosto 2020 ISSN Nº 2708-0935  
15  
Fact 1. In a Boolean algebra (푆, +,·, 퐼, ⊥, 푇) the following laws are satisfied for all 푥, 푦, 푧 ∈ 푆:  
Associative laws:  
푥 + (푦 + 푧) = (푥 + 푦) + 푧  
푥 + 푥 = 푥  
푥 · (푦 · 푧) = (푥 · 푦) · 푧  
푥 · 푥 = 푥  
Idempotent laws:  
Boolean algebras are used to model operations over the elements of a set that relates two  
elements with the maximum (+ operation) or the minimum (· operation) of both elements in a  
partial order where the minimum and the maximum are  and T, respectively. In other words, a  
partial order  can be defined over S where  
푎, 푏 ∈ 푆 (푎 ≤ 푏 ⇔ 푎 + 푏 = 푏)  
or equivalently  
and  
푎, 푏 ∈ 푆 (푎 ≤ 푏 ⇔ 푎 ·  = 푎)  
푎 ∈ 푆 (⊥ ≤ 푎 ≤ 푇)  
[14].  
Also, intuitively speaking, all the elements have an associated complement counterpart that  
together form the maximum but apart from the minimum as stated in the complement laws.  
Fact 2. The tuple ({0, 1},∨,∧, ¬, 0, 1) is a Boolean algebra with the operations of disjunction (∨),  
conjunction (∧) and negation (¬) defined as follow.  
0
0
0
1
1
1
1
0
0
0
0
1
0
1
x
0
1
¬x  
1
1
1
0
This is the most elemental Boolean algebra and is the one found in classical binary logic that has  
applications in several areas of computer sciences [10, 1214].  
Fact 3. The tuple (℘(푈),∪,∩, 퐶, ∅, 푈) is a Boolean algebra with the operation of union (∪) ,  
intersection (∩) and complement (퐶) defined as follows for all 푋, 푌 ∈ ℘(푈).  
REVISTA INNOVACIÓN Y SOFTWARE  
VOL 1 Nº 1 Marzo - Agosto 2020 ISSN Nº 2708-0935  
16  
=
{
|
=
{
|
푌}  
 = {  
 |   푋}  
This specific Boolean algebra is of great interest in science since mathematics in general are  
founded in set theory [1114].  
In this specific work, the last two described Boolean algebras are crucial because the current  
problem is to find the number of objects that makes true a logical sentence. In this context, the  
logical sentence is the query to be issue to the platform. The proposed solution takes advantage  
of the equivalences between classical logic and set theory in the context of Boolean algebras to  
solve this problem.  
Boolean functions  
In some contexts, the combination of operations in the set {0, 1} are called Boolean functions. The  
following definitions relate to this subject.  
Definition 2. A Boolean function of degree  is a function 푓: {0, 1} → {0, 1} where  is an atom (a  
single variable or value) or a composition of the operations ,  and ¬ of the Boolean algebra  
(
{0, 1},∨,∧, ¬, 0, 1). This composition is called a Boolean expression, and the variables of the Boolean  
expression are called Boolean variables.  
This concept has wide application in logic gates circuits design. In this topic one of the main  
problems is the simplification of Boolean expressions [9, 12, 14, 16].  
In the case of this work, these are of great importance because, as we will see, each query has  
an associated Boolean expression. The objective is to simplify it in order to obtain an expression  
that involves less computation.  
The simplification of a Boolean expressions may be done symbolically by applying the laws of a  
Boolean algebra (definition 1) but also by applying specific methods that simplify an equivalent  
form of the expression.  
Definition 3. Two Boolean expressions 퐴(푥 , 푥 , . . . , 푥 ) and 퐵(푥 , 푥 , . . . , 푥 ) are equivalent if  
2
2
푥 , 푥 , . . . , 푥 ∈ {0, 1} (퐴(푥 , 푥 , . . . , 푥 ) = 퐵(푥 , 푥 , . . . , 푥 ))  
ꢃ 2 ꢂ ꢃ 2 ꢂ ꢃ 2 ꢂ  
Definition 4. A normal form of a Boolean expression 푓(푥 , 푥 , . . . , 푥 ) is an equivalent Boolean  
2
expression in the form 푔(푥 , 푥 , . . . , 푥 ) = t ∗ t ∗, . . . ,∗ t where each t ≤ 푖 ≤ ꢄ is in the form y ∗  
2
2
y2 ∗, . . . ,∗ y and each y ≤ 푗 ≤ ꢆ is in the form  or ¬푥 where 1 ≤ ꢆ ≤ 푛.  
REVISTA INNOVACIÓN Y SOFTWARE  
VOL 1 Nº 1 Marzo - Agosto 2020 ISSN Nº 2708-0935  
17  
When  is  and  is  the normal form is called conjunctive (CNF). Similarly, when  is  and  is  
the normal form is called disjunctive (DNF). Additionally, when the normal form is conjunctive  
each t ≤ 푖 ≤ ꢄ is called a maxterm. Similarly, when the normal form is disjunctive each t ≤ 푖 ≤ ꢄ  
is called a minterm.  
The Quine-McCluskey algorithm is one of such methods that uses the normal form of a Boolean  
expression, specifically DNF, to obtain an equivalent minimal expression. The algorithm, in  
essence, test combinations of the minterms in order to find those that are essential to represent  
the value of the expression. It is known that it does not performance well when the size of the  
input, in this case the expression to simplify, is big. In fact, the problem of simplification of  
Boolean expressions is considered NP-hard [12, 14, 16].  
However, the simplification of a Boolean expression is steel of great importance to this work,  
because small queries are preferable to big ones.  
Definition 5. Let  , 푋 , . . . , 푋 be given sets. A predicate is a function  ∶ 푋 × 푋 ×, . . . , 푋 → {0, 1}  
2
2
[10, 13].  
It obvious that a predicate has an associated Boolean expression if each atom is replaced by a  
Boolean variable.  
Definition 6. The expression  = {푥 | 푃 (푥)} is equivalent to  ∈ 푆 ⟺ 푃 (푥) [11].  
The following theorem will be useful in the modeling of the solution.  
Theorem 1. The following relations are satisfied for any  = {푥 | 푃 (푥)} and  = {푥 | 푄(푥)}:  
(
(
(
a)  ∪ 퐵 = {푥 | 푃 (푥) ∨ 푄(푥)}  
b)  ∩ 퐵 = {푥 | 푃 (푥) ∧ 푄(푥)}  
c)  = {푥 | ¬푃 (푥)}  
Demonstration. Proof follows directly from fact 3 and definition 6.  
This relations may be easily understood, since if  contains all the elements  such that  (푥) = 1  
and  is all the elements  such that 푄(푥) = 1 then it follows from the definition 6 and the  
definition of union in the fact 3 that  ∪ 퐵 will have the elements  such that  (푥) ∨ 푄(푥) = 1.  
The same analysis can be done for the intersection and complement cases.  
REVISTA INNOVACIÓN Y SOFTWARE  
VOL 1 Nº 1 Marzo - Agosto 2020 ISSN Nº 2708-0935  
18  
Inclusion-exclusion principle  
First let consider the cardinality of the power set. This will be useful later in the description of the  
proposed solution.  
Fact 4. The cardinality of the power set of  is  
(푈 ) = ꢇ||  
[13].  
The inclusion-exclusion principle (IEP) is a mathematical formula that can be used to obtain the  
cardinality of the union of finite sets taking into account the cardinality of all possible intersections  
of the given sets.  
Fact 5 (Inclusion-exclusion principle). The cardinality of the union of sets  ∈ {,2,...,} is  
ꢊ⋃ 푆 ꢊ =  
(−1)|  
퐽|ꢌꢃ  
ꢍ⋂ 퐴ꢍ  
ꢉꢋꢃ  
∅≠퐽∈{ꢃ,2,…ꢂ}  
ꢎ∈퐽  
The number of every possible intersection of n sets is the same that the number of subsets of a  
set of n elements without counting the empty set. This leads to the following fact taking into  
account fact 4.  
Fact 6. There are  
  1  
terms in the inclusion-exclusion principle formula for  sets.  
This means that an algorithm that calculates the cardinality of the union of  sets by directly  
using the IEP have an exponential complexity [15, 17].  
In the proposed solution the IEP is used to decompose a given query in many smaller sub-queries  
that will be issued to the platform search API. In the next section, will be shown how to manage  
the problem of the exponential complexity when using this method.  
RESULTS AND DISCUSSION  
The problem to solve is: How to get the results number of complex queries to GitHub?  
REVISTA INNOVACIÓN Y SOFTWARE  
VOL 1 Nº 1 Marzo - Agosto 2020 ISSN Nº 2708-0935  
19  
The proposed solution follows a divide and conquer approach as follows:  
1.  
2.  
3.  
Simplify and decompose complex queries into smaller simple sub-queries.  
Issue the sub-queries to the server and obtain the results amount of each one.  
Sum up the results of the sub-queries into one that constitutes the results amount of  
the initial complex query.  
In the next subsection a mathematical model and formalization of the solution is given.  
Mathematical model  
Mathematically speaking, the problem to solve is as follows.  
Let  be the set of all the objects in the platform (projects, source codes, etc). Let  ∶ 푂 → {0, 1}  
be a predicate that represents the query to issue. Then, the set  of all objects that match the  
query  is  
 = {표 | 푄(표)}  
The problem to solve is finding  when the associated Boolean expression given by  has many  
compositions and logical connectives.  
The first step of the proposed solution is to simplify the Boolean expression associated to the  
query. This may be done by symbolic transformations applying the laws that a Boolean algebra  
satisfies or also by using the Quine-McCluskey algorithm. It is known that this solution is not  
effective when the size of the input is too big. For this reason, the resultant expression (simplified  
or not) must be decomposed into various sub-expressions. For this purpose, the DNF expression  
is used. By applying theorem 1 it is known that if 푄(표) = 푄 (표) ∨ 푄 (표) ∨ . . .∨ 푄 (표) is the DNF then  
2
푟 = {표 | 푄 (표) ∨ 푄 (표) ∨ . . .∨ 푄 (표)} = 푟 ∪ 푟  . . . 푟ꢒ  
2
where  
 = {o | Q (o)}  
for each 1 ≤ 푖 ≤ 푛.  
Each Q (o) is in the form  (표) ∧ 푄 (표) ∧ . . .∧ 푄 (표). This kind of query can be issued directly to  
GitHub because it does not have composition and only have conjunctive connectives. The  
conjunctives connectives (AND in the query language of GitHub) can be stripped of the sub-query  
since GitHub automatically interprets a tuple of atoms as a conjunction. In this case the is no use  
REVISTA INNOVACIÓN Y SOFTWARE  
VOL 1 Nº 1 Marzo - Agosto 2020 ISSN Nº 2708-0935  
20  
of conjunctive or disjunctive connectives in the query. Nevertheless, the case of the negation is  
a problem that, for now, cannot be avoided. So, in this case, a query must be designed with care  
in order not to exceed the restriction that GitHub Search API imposes in the number of operators.  
After the sub-queries have been sent, the next step is to find the results amount of the main  
query by applying IEP (fact 5). The problem with this approach is that the number of terms –  
according to fact 6 in IEP formula with  sets is  − 1, which is the number of sub-queries to  
be issued to the server.  
However, each term in IEP is of the form of an intersection. Moreover, the terms in the expression  
associated to the DNF are also in the form of intersection. Then, by applying fact 1, that it is  
possible to reduce each term of the IEP formula so that some terms might be repeated afterwards.  
For this reason, it is proposed to use a cache for storing already issued queries as well as its  
respective results quantities in order to reduce the number of issued queries. However, work still  
need to be done to accelerate the computations of the terms in the IEP formula.  
Solution design  
QuantityEr is designed by using the object-oriented paradigm. Care on extension has been taken  
from the beginning by assigning a class to each sub-process in the solution. In Figure 1 is outlined  
the class diagram of the most important classes. The classes are given as abstract base classes,  
so they must be extended for a particular problem. Currently, the extensions for solving the  
problem in the specific case of GitHub are implemented. Next, it is briefly described each class.  
Main: Coordinate the interaction between the Input, Engine and Output classes objects. That is,  
the main algorithm is implemented inside this class.  
Input: Currently, the queries can be presented to QuantityEr from two sources: the command  
line and files. Several queries can be presented to the application in one single execution. The  
responsibility of this class is to present these sources as a stream to the Parser. Since the logic  
of the input is encapsulated in one class, other kind of inputs may be added in the future like, for  
example, inputs from the network.  
Parser: Translate the queries presented as input to a standard language that can be managed  
by the other entities. Since the logic of parsing is encapsulated in one class the syntax of the  
language used in the input queries do not need to be like the one expected by GitHub. This may  
ease the input allowing a cleaner syntax.  
MiddleCode: Represents the intermediate language that the other classes understand. All the  
queries inside the application are in this format.  
REVISTA INNOVACIÓN Y SOFTWARE  
VOL 1 Nº 1 Marzo - Agosto 2020 ISSN Nº 2708-0935  
21  
Engine: Coordinate the interaction between the Decomposer, Cache, Translator and QueryIssuer  
classes objects. That is, the algorithm that give the solution to the problem is implemented inside  
this class.  
Decomposer: Decompose a complex query into several smaller simple queries. Currently, the  
extension using IEP is implemented.  
Cache: Store the results amounts of already issued queries. Currently, an in-memory cache is  
available as well as a file-based one.  
Translator: Translate a given simple sub-query to an issuable one. Currently, only GitHub is  
supported but more platforms may be added in the future.  
QueryIssuer: Emit a simple sub-query to the platform and obtain the results amount or inform  
of an error if it was the case.  
Figure 1. Class diagram of main classes of QuantityEr  
REVISTA INNOVACIÓN Y SOFTWARE  
VOL 1 Nº 1 Marzo - Agosto 2020 ISSN Nº 2708-0935  
22  
Execution example results  
In this section we consider a usage example result in order to study the behavior of the application  
with complex queries.  
In this case, the queries ask for the amount of source codes that use the classical synchronization  
mechanisms defined in the asyncio, multiprocessing and threading Python libraries.  
The results are summarized in table 1 and figure 2.  
The command lines options to the program, the actual output, the presented queries as well as  
17  
other execution example can be found in attached document examples.html .  
Figure 2. Sub-queries amount. Total vs Issued  
In table 1 and figure 2 can be seen that the number of sub-queries depend on the ability of the  
18 19  
20 21  
[18] Sympy [19] library to simplify the given expression. Also, in this case, the  
Python’s  
1
1
1
2
2
7
8
9
0
1
REVISTA INNOVACIÓN Y SOFTWARE  
VOL 1 Nº 1 Marzo - Agosto 2020 ISSN Nº 2708-0935  
23  
presence of the cache effects a great reduction on the number of issued queries, especially when  
the number of sub-queries is big.  
Table 1. Execution example results summary.  
#
means quantity. % means percent.  
Sub-queries  
Cached  
Issued  
Queries libraries  
No.  
Amount Total  
%
%
01  
02  
03  
asyncio  
multiprocessing  
threading  
69 053  
159 515  
1 451 344 31  
15  
31  
0
0
0
0
0
0
15  
31  
31  
100  
100  
100  
asyncio ∩  
multiprocessing  
asyncio ∩  
threading  
04  
05  
06  
3 095  
16 383 16 353 99.82 30  
16 383 16 353 99.82 30  
0.18  
0.18  
100  
29 658  
124 327  
multiprocessing  
31  
0
0
31  
threading  
asyncio ∩  
multiprocessing ∩  
0
7
8
threading  
1 947  
16 383 16 353 99.82 30  
0.18  
asyncio ∪  
multiprocessing  
asyncio ∪  
threading  
0
228 130  
511  
435  
435  
85.13 76  
85.13 76  
90.91 93  
14.87  
0
9
0
1 494 420 511  
14.87  
9.09  
multiprocessing  
1
1 489 850 1 023 930  
threading  
asyncio ∪  
multiprocessing ∪  
threading  
11  
1 528 155 16 383 16 185 98.79 198 1.21  
CONCLUSIONS  
In this report a tool, named QuantityEr, to obtain the results number of complex queries to GitHub  
search API has been described. The application uses the inclusion-exclusion principle and other  
mathematical abstractions to decompose the query in several simple sub-queries. The application  
uses a cache in order to reduce the number of sub-queries issued to the server. Even though it  
is considered that the use of the cache improves the solution and makes it viable, more work  
may to be done in order to accelerate the computations of the IEP formula terms. Moreover, the  
application may be extended to resolve other restrictions problems in GitHub and other platforms.  
REVISTA INNOVACIÓN Y SOFTWARE  
VOL 1 Nº 1 Marzo - Agosto 2020 ISSN Nº 2708-0935  
24  
REFERENCES  
1] C. Dawson and B. Straub, Building Tools with GitHub: Customize Your Workflow, 1st ed.  
[
O’Reilly Media, Inc., 2016.  
[2] ——, “Python and the Search API,” in Building Tools with GitHub: Customize Your Workflow,  
1
st ed. O’Reilly Media, Inc., 2016, pp. 5380.  
[
3] S. Amann, S. Beyer, K. Kevic, and H. Gall, “Software Mining Studies: Goals, Approaches,  
Artifacts, and Replicability,” in Software Engineering: International Summer Schools, LASER  
013-2014, Elba, Italy, Revised Tutorial Lectures, ser. Lecture Notes in Computer Science, B.  
2
Meyer and M. Nordio, Eds. Cham: Springer International Publishing, 2015, pp. 121158. [Online].  
[4] M. Beller, R. Bholanath, S. McIntosh, and A. Zaidman, “Analyzing the State of Static Analysis:  
A Large- Scale Evaluation in Open Source Software,” in 2016 IEEE 23rd International Conference  
on Software Analysis, Evolution, and Reengineering (SANER), vol. 1, Mar. 2016, pp. 470481.  
[5] Y. Zhang, G. Yin, Y. Yu, and H. Wang, “Investigating Social Media in GitHub’s Pull-requests:  
A Case Study on Ruby on Rails,” in Proceedings of the 1st International Workshop on Crowd-  
based Software Development Methods and Technologies, ser. CrowdSoft 2014. New York, NY,  
USA: ACM, 2014, pp. 3741, event-place: Hong Kong, China. [Online]. Available:  
[6] ——, “A Exploratory Study of @-Mention in GitHub’s Pull-Requests,” in 2014 21st Asia-Pacific  
Software Engineering Conference, vol. 1, Dec. 2014, pp. 343350.  
[
7] A. A. Sawant and A. Bacchelli, “fine-GRAPE: fine-grained APi usage extractor  an approach  
and dataset to investigate API usage,” Empirical Software Engineering, vol. 22, no. 3, pp. 1348–  
371, Jun. 2017. [Online]. Available: https://doi.org/10.1007/s10664-016-9444-6  
1
[8] G. Gousios, B. Vasilescu, A. Serebrenik, and A. Zaidman, “Lean GHTorrent: GitHub Data on  
Demand,” in Proceedings of the 11th Working Conference on Mining Software Repositories, ser.  
MSR 2014. New York, NY, USA: ACM, 2014, pp. 384387, event-place: Hyderabad, India.  
[9] J. W. Grossman, “Functions,” in Handbook of Discrete and Combinatorial Mathematics, 2nd  
ed., K. H. Rosen, Ed. Chapman & Hall/CRC, 2018, pp. 3242.  
[10] ——, “Propositional and Predicate Logic,” in Handbook of Discrete and Combinatorial  
Mathematics, 2nd ed., K. H. Rosen, Ed. Chapman & Hall/CRC, 2018, pp. 1222.  
[11] ——, “Set Theory,” in Handbook of Discrete and Combinatorial Mathematics, 2nd ed., K. H.  
Rosen, Ed. Chapman & Hall/CRC, 2018, pp. 2232.  
REVISTA INNOVACIÓN Y SOFTWARE  
VOL 1 Nº 1 Marzo - Agosto 2020 ISSN Nº 2708-0935  
25  
[12] R. Johnsonbaugh, “Boolean Algebras and Combinatorial Circuits,” in Discrete Mathematics,  
8
th ed. New York, NY: Pearson, 2017, pp. 532567.  
[13] ——, “Sets and logic, in Discrete Mathematics, 8th ed. New York, NY: Pearson, 2017, pp.  
161.  
[14] J. G. Michaels, “Boolean Algebras,” in Handbook of Discrete and Combinatorial Mathematics,  
2
nd ed., K. H. Rosen, Ed. Chapman & Hall/CRC, 2018, pp. 269379.  
[15] R. G. Rieper, “Inclusion/Exclusion,” in Handbook of Discrete and Combinatorial Mathematics,  
2
nd ed., K. H. Rosen, Ed. Chapman & Hall/CRC, 2018, pp. 110116.  
[16] K. H. Rosen, “Boolean Algebra,” in Discrete Mathematics and Its Applications, 8th ed.  
New York, NY: McGraw-Hill, 2019, pp. 847883.  
[17] ——, “Inclusion–Exclusion,” in Discrete Mathematics and Its Applications, 8th ed. New  
York, NY: McGraw-Hill, 2019, pp. 579585.  
[18] S. Kapil, Clean Python: Elegant Coding in Python. Apress, 2019.  
[19] J. M. Stewart, “SymPy: A Computer Algebra System,” in Python for Scientists, 2nd ed. New  
York, NY: Cambridge University Press, 2017, pp. 128149.  
REVISTA INNOVACIÓN Y SOFTWARE  
VOL 1 Nº 1 Marzo - Agosto 2020 ISSN Nº 2708-0935