This release brings substantial improvements: whole organizations and particular repositories can now be scanned for one host at the same time, the function that prepares commit statistics has been extended, and the workflow for getting files has been simplified.
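As a rough illustration of the workflow this release targets, the sketch below chains the calls into one pipeline. It assumes the GitStats package is installed, that a token is available in the `GITHUB_PAT` environment variable, and that repositories are passed as full names (`owner/repo`). The wrapper name `pull_commits_stats` and the values in the commented call are hypothetical placeholders, not part of the package.

```r
# Sketch only: the pipeline is wrapped in a function, so nothing is
# requested from an API until the function is actually called.
# Assumption: GitStats is installed and GITHUB_PAT holds a valid token.
pull_commits_stats <- function(orgs, repos, since) {
  GitStats::create_gitstats() |>
    # scan whole organizations and particular repositories on one host
    GitStats::set_github_host(orgs = orgs, repos = repos) |>
    # pull commits starting from the given date
    GitStats::get_commits(since = since) |>
    # aggregate the pulled commits into a stats table (default settings)
    GitStats::get_commits_stats()
}

# Hypothetical call; organization, repository and date are placeholders.
# stats <- pull_commits_stats(
#   orgs  = "my-org",
#   repos = "another-org/some-repo",
#   since = "2024-01-01"
# )
```

Wrapping the chain in a function keeps the example free of side effects; the call itself needs network access and a valid token.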
- Whole organizations and particular repositories can be scanned at the same time by passing both `orgs` and `repos` in `set_*_host()` functions (#400).
- Improved `get_commits_stats()` function (#556, #557) with:
  - `group_var` parameter,
  - `time_interval` parameter renamed to `time_aggregation`,
  - `yearly` aggregation added to `time_aggregation` parameter,
  - input changed from `GitStats` to `commits_data` object, which allows building the workflow in one pipeline (`create_gitstats() |> set_*_host() |> get_commits() |> get_commits_stats()`).
- Merged `get_files_content()` and `get_files_structure()` into one `get_files()` (#564).
- `.error` parameter added to the `set_*_host()` functions to control whether an error should pop up when wrong input is passed (#547).
- `author_name` and `author_login` if it was missing in `commits_table` (#550).
- `GraphQL` response error when pulling repositories is handled with an R error. Earlier, `GitStats` just returned an empty table with no clue as to what had happened, as errors from `GraphQL` are returned as list outputs (they do not break code).
- `orgs` parameter in `set_github_host()` (#562).

This is a patch release which introduces some hot fixes and new data in `get_commits()` output.
- `repo_url` column added to the output of `get_commits()` function (#535).
- `verbose` mode is set to `FALSE` (#525) and fixed checking token scopes for GitLab (#526).
- `get_repos_urls()` output when individual repositories are set in `set_*_host()` (#529). Earlier the function pulled all repositories for an organization, even though repositories, not whole organizations, were defined for the host. This is similar to the issue solved earlier (#439).

This is a patch release which improves `get_R_package_usage()` in terms of speed and the possibility to pull data on multiple R packages at once, and brings a new `get_storage()` function and some fixes for checking token scopes and setting hosts.
- `get_R_package_usage()` function:
  - data on multiple R packages can be pulled at once (`packages` parameter replacing old `package_name`) (#494),
  - `split_output` parameter has been added: when set to `TRUE`, a `list` of `tibbles` (one element of the `list` for every package) is returned instead of one `tibble`.
- `get_repos()` (#492). Earlier this was only possible for GitHub organizations and GitLab groups.
- `get_storage()` function to retrieve data from the `GitStats` object, either whole or particular datasets (e.g. `commits`, `repositories` or `R_package_usage`) (#509).
- `GitHost` is not passed to `GitStats`. This also applies to the situation when `GitStats` looks for default tokens (not defined by the user). Earlier, if tests for the token failed, an empty token was passed and `GitStats` was created, which was misleading for the user.
- Passing the public host (`github.com` or `https://github.com`) to `set_github_host()` (#475).
- Passing the host URL (`{host_url}`, `http://{host_url}` or `https://{host_url}`) to the `host` parameter in `set_*_host()` function (#399).

This minor release comes with a new `get_files_structure()` function and adjustments to `get_files_content()`, so that the user can pull a custom file tree from a repository (by defining a pattern of files and depth of directories) and pull the content of those files.
- New `get_files_structure()` function to pull the files structure for a given repository, with the possibility to control the level of directories (`depth` parameter) and to limit output to files matching the regex argument passed to the `pattern` parameter (#338). Together with that, `get_files()` function was renamed to `get_files_content()` to better reflect its purpose.
- Adjusted `get_files_content()` so it can make use of `files_structure` pulled to `GitStats` `storage` with `get_files_structure()` function, if `file_path` is set to `NULL` and `use_files_structure` parameter to `TRUE` (both are by default) (#467).
- `progress` parameter added to user functions to control showing of the `cli` progress bar separately from messages (which are controlled with `verbose`) (#465).
- Message for the case when neither `orgs` nor `repos` is specified changed from warning to info (#456).
- `gh-pages`, lint and check for bumping version.

This is a patch release with substantial improvements to some functions (`get_repos()`, `get_files()` and `get_R_package_usage()`), adding `with_files` and `in_files` parameters, fixing the `cache` feature and introducing a new `get_repos_urls()` function, a minimalist version of `get_repos()`:
- `get_repos_urls()` function to fetch repository URLs (either web or API, chosen with the `type` parameter). It may also return only those repository URLs that contain a given file or files (by passing an argument to the `with_files` parameter) or a text in code blobs (`with_code` parameter). This is a minimalist version of `get_repos()`, which leaves out the whole process of parsing (the search response into a repositories one) and adding statistics on repositories. This makes it poorer in content but faster (#425).
- `with_files` parameter added to `get_repos()` function, which makes it possible to search for repositories with a given file or files and return full output for repositories.
- `with_code` parameter (as a character vector) in `get_repos()` and `get_repos_urls()` (#282).
- `in_files` parameter added to `get_repos()`, which works with the `with_code` parameter. When both are defined, `GitStats` searches code blobs only in the given files.
- Removed `dplyr::glimpse()` from `get_*()` functions, so there is printing to the console only if the result of a `get_*()` function is not assigned to an object (#426).
- `get_R_package_usage()` output now also includes the repository full name (#438).
- Improved `get_R_package_usage()` by optimizing the search of package names in `DESCRIPTION` and `NAMESPACE` files: the filtering method was removed and replaced with a `filename:` filter directly in the search endpoint query (#428).
- Fixed `get_files()` when scanning scope is set to `repositories`. Earlier, it pulled given files from whole organizations, even if scanning scope was set to `repos` with `set_*_host()`. Now it shows only files for the given repositories (#439).
- `verbose` parameter now controls showing of the progress bars (#453).

This is a patch release with some hot issues that needed to be addressed, notably covering `set_*_host()` functions with `verbose` control, tweaking the `verbose` feature in general, fixing pulling data for GitLab subgroups and speeding up `get_files()` function.
- Sped up `get_files()` when `GitStats` is set to scan whole hosts, by switching to `Search API` instead of pulling files via `GraphQL` (with iteration over organizations and repositories) (#411).
- When hosts are set without `orgs` or `repos`, GitStats no longer pulls all organizations. Pulling all organizations from the host is triggered only when the user decides to pull repositories from organizations. If the user decides, e.g., to pull repositories by code, there is no need to pull all organizations (which may be a time-consuming process), as GitStats then uses `Search API` (#393).
- `set_*_host()` functions are covered with `verbose` control, with `verbose_off()` or the `verbose` parameter (#413).
- Setting `verbose` to `FALSE` does not lead to hiding output of the `get_*()` functions, i.e. a glimpse of the table will always appear after pulling data, even if `verbose` is switched off. The `verbose` parameter now serves only to show and hide messages to the user (#423).
- `set_*_host()` function (#415)

This is a major release with general changes in workflow (simplifying it), changes in setting `GitStats` hosts, deprecation of some not very useful features (like plots, setting parameters separately) and a new `get_release_logs()` function.
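A minimal sketch of the simplified workflow, assuming the GitStats package is installed and that `gitstats_object` already has a host set. The wrapper name `find_repos_with_code` is a hypothetical helper used only for illustration, not part of the package.

```r
# Sketch only: defining the function does not contact any API.
# A single get_repos() call stands in for the older chain of
# pull_*(), get_*() and get_*_stats() calls and returns a table.
find_repos_with_code <- function(gitstats_object, code) {
  GitStats::get_repos(
    gitstats_object,
    with_code = code,  # search repositories by a code blob
    verbose = FALSE    # limit messages shown while pulling data
  )
}

# Hypothetical call, with 'your_code' standing in for a real phrase:
# repos_table <- find_repos_with_code(gitstats_object, "your_code")
```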
- `set_host()` function is replaced with more explicit `set_github_host()` and `set_gitlab_host()` (#373). If you wish to connect to a public host (e.g. `api.github.com`), you do not need to pass an argument to the `host` parameter.
- For `repositories`, `commits`, `R_package_usage` or other data, you should directly use the corresponding `get_*()` functions instead of `pull_*()`, which are deprecated. These `get_*()` functions pull data from the API, parse it into a table, add some goodies (additional columns) if needed and return a table instead of the `GitStats` object, which in our opinion is more intuitive and user-friendly (#345).
  That means you do not need to run two or three additional function calls in a pipe as before, e.g. `pull_repos(gitstats_object) %>% get_repos() %>% get_repos_stats()`, but you just run `get_repos(gitstats_object)` to get the data you need.
- When you run the same `get_*()` function again, `GitStats` will pull the data from its storage and not from the API as it did the first time, unless you change parameters for the function (e.g. starting date with `since` in `get_commits()`) or directly change the `cache` parameter in the function (#333).
- `pull_repos_contributors()` as a separate function is deprecated. The parameter `add_contributors` is now set by default to `TRUE` in `get_repos()`, which seems more reasonable as the user gets all the data.
- In `get_commits()`, old parameters (`date_from` and `date_until`) were replaced with new, more concise ones (`since` and `until`).
- `set_params()` function is removed (#386). Now the logic is moved straight to `get_*()` functions. For example, if you want to pull repositories with a specific `code blob`, you do not need to define anything with `set_params()` (as previously with `search_mode` and `phrase` parameter) but you simply run `get_repos(with_code = 'your_code')` (#333).
- `verbose` parameter has been introduced for limiting messages to the user when pulling data; it can be set in all `get_*()` functions. You can also turn the `verbose` mode on/off globally with `verbose_on()`/`verbose_off()` functions.
- `get_repos_stats()` function was deprecated as its role was unclear: unlike `get_commits_stats()` it did not aggregate repositories data into a new stats table, but only added some new numeric columns, like number of contributors (`contributors_n`) or last activity in `difftime` format, which is now done within `get_repos()` function.
- `team` and filtering by `language` is no longer supported: these features were quite heavy for the package performance and did not bring much added value. If needed, the user can always filter the output (formatted responses pulled from API) by contributors or language (#384).
- Plot functions have been deprecated in `GitStats`, as the package is meant to be basically for back-end purposes and this is the field where developer's effort should now go (#381). If needed and requested, plot functions may be brought back in future releases.
- New `get_release_logs()` function (#356).
- `get_orgs()` is renamed to `show_orgs()` to reflect that it does not pull data from API, but only shows what is in the `GitStats` object.
- `author_login` and `author_name` (#332).
  This is due to the mix of GitHub/GitLab handles and display names in the `author` column (the original author `name` field in the commits API response).
- Improved printing of the `GitStats` object: now when you return the `GitStats` object in the console, it prints `GitStats` data divided into sections to give more readable information to the user: `scanning scope` (organizations and repositories), and `storage` (the output tables stored in `GitStats` with basic information on dimensions) (#329).
- `contributors` response (#331).
- `gts_to_posixt()` helper, which depended on `stringr`, caused for some users an empty value to be passed to the `since` parameter of the commits endpoint, which ended in a Bad Request Error (400) and an infinite loop of retrying the response (#360).
- `pull_R_package_usage()` with `get_R_package_usage()` functions to pull repositories where the package name is found in DESCRIPTION or NAMESPACE files or code blobs with phrases related to using an R package (`library(package)`, `require(package)`) (#326, #341),
- `pull_files()` with `get_files()` to pull content of text files (#200).
- `GitStats` with `set_host()` function by using `repos` parameter instead of `orgs` (#330).
- `id` renamed to `repo_id` and `name` to `repo_name`,
- `default_branch` column added to repositories output as a consequence of #200.
- `get_*_stats()` functions to prepare summary stats from pulled data: repositories and commits (#276),
- `gitstats_plot()` which takes as an input `repos_stats` or `commits_stats` class objects (#276),
- `get_*` renamed to `pull_*`; `get_*` functions now retrieve already pulled data from the GitStats object (#294),
- `setup()` renamed to `set_params()` (#294),
- `set_connection()` renamed to `set_host()` (#271),
- `add_team_member()` renamed to `set_team_member()` (#271).
- When tokens are stored in environment variables (`GITHUB_PAT` or `GITLAB_PAT`), there is no need to pass them as an argument to `set_host()` (#120),
- `pull_users()` function to pull information on users (#199),
- `orgs` are passed (#258),
- `get_orgs()` function to print all organizations (#283),
- `reset()` function (#270)
- `reset_language()` or setting `language` parameter to `All` in `setup()` function (#231)
- Removed `contributors` as a basic stat when pulling `repos` by `org` and by `phrase` to improve the speed of pulling repositories data. Added `pull_repos_contributors()` user function and `add_contributors` parameter to `pull_repos()` function to conditionally add information on contributors to the repositories table (#235)
- `api_url` column as an address to the repository, not the host (#201),
- `%>%` (#289).

This is the first release of GitStats with the following features:
- `create_gitstats()` - creating GitStats object,
- `set_connection()` - adding hosts to GitStats object,
- `setup()` - setting search parameter to org, team or phrase, setting programming language of repositories,
- `get_repos()` - pulling repositories from GitHub and GitLab API in a standardized table,
- `get_commits()` - pulling commits from GitHub and GitLab API in a standardized table,
- `set_team_member()` - adding team members to GitStats object.