tidytext - Text Mining using 'dplyr', 'ggplot2', and Other Tidy Tools
Using tidy data principles can make many text mining tasks easier, more effective, and consistent with tools already in wide use. Much of the infrastructure needed for text mining with tidy data frames already exists in packages like 'dplyr', 'broom', 'tidyr', and 'ggplot2'. In this package, we provide functions and supporting data sets to allow conversion of text to and from tidy formats, and to switch seamlessly between tidy tools and existing text mining packages.
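As a rough illustration of the "tidy" text format the description refers to (one token per row), the following Python sketch splits documents into (document, word) rows and then counts words. It is not the tidytext API; the function name `to_tidy_tokens`, the regular expression, and the sample text are assumptions made only for this example.

```python
import re
from collections import Counter

def to_tidy_tokens(docs):
    """Turn {doc_id: text} into a list of (doc_id, word) rows, one token per row.

    Hypothetical helper for illustration; tokenization here is a simple
    lowercase word match, not what any particular package does.
    """
    rows = []
    for doc_id, text in docs.items():
        for word in re.findall(r"[a-z']+", text.lower()):
            rows.append((doc_id, word))
    return rows

docs = {"d1": "It is a truth", "d2": "A truth universally acknowledged"}
tidy = to_tidy_tokens(docs)

# Once the text is in one-token-per-row form, counting is an ordinary group-by.
word_counts = Counter(word for _, word in tidy)
```

With the rows in this shape, the usual data-frame verbs (filter, group, count, join) apply to text without any special-purpose structure.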
Last updated 11 months ago
natural-language-processing, text-mining, tidy-data, tidyverse
16.72 score 1.2k stars 60 dependents 17k scripts 34k downloads
pins - Pin, Discover, and Share Resources
Publish data sets, models, and other R objects, making it easy to share them across projects and with your colleagues. You can pin objects to a variety of "boards", including local folders (to share on a networked drive or with 'DropBox'), 'Posit Connect', 'AWS S3', and more.
Last updated 20 days ago
azure, gcloud, rpins, rsconnect, s3, storage
14.23 score 321 stars 17 dependents 1.9k scripts 4.8k downloads
butcher - Model Butcher
Provides a set of S3 generics to axe components of fitted model objects and help reduce the size of model objects saved to disk.
Last updated 24 days ago
11.33 score 132 stars 13 dependents 146 scripts 4.0k downloads
widyr - Widen, Process, then Re-Tidy Data
Encapsulates the pattern of untidying data into a wide matrix, performing some processing, then turning it back into a tidy form. This is useful for several operations such as co-occurrence counts, correlations, or clustering that are mathematically convenient on wide matrices.
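The widen, process, re-tidy pattern described here can be sketched in plain Python for the co-occurrence case. This is an illustration under stated assumptions, not widyr's implementation: the function name `pairwise_count_sketch` and the sample rows are hypothetical, and the "wide matrix" is a simple list of lists.

```python
def pairwise_count_sketch(rows):
    """rows: (group, item) pairs in tidy form. Returns (item1, item2, n) rows."""
    rows = list(rows)
    items = sorted({item for _, item in rows})
    groups = sorted({group for group, _ in rows})

    # Widen: build an item-by-group 0/1 matrix.
    present = {(item, group) for group, item in rows}
    wide = [[1 if (i, g) in present else 0 for g in groups] for i in items]

    # Process: co-occurrence counts are the matrix product wide x wide^T.
    co = [[sum(x * y for x, y in zip(r1, r2)) for r2 in wide] for r1 in wide]

    # Re-tidy: back to one row per ordered item pair, dropping self-pairs and zeros.
    return [
        (items[a], items[b], co[a][b])
        for a in range(len(items))
        for b in range(len(items))
        if a != b and co[a][b] > 0
    ]

tidy_rows = [
    ("doc1", "cat"), ("doc1", "dog"),
    ("doc2", "cat"), ("doc2", "dog"), ("doc2", "fish"),
]
pairs = pairwise_count_sketch(tidy_rows)
```

The middle step is where the wide form pays off: a co-occurrence count is a single matrix product, which would be awkward to express directly on the long table.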
Last updated 2 years ago
11.11 score 328 stars 2 dependents 1.7k scripts 1.8k downloads
janeaustenr - Jane Austen's Complete Novels
Full texts for Jane Austen's 6 completed novels, ready for text analysis. These novels are "Sense and Sensibility", "Pride and Prejudice", "Mansfield Park", "Emma", "Northanger Abbey", and "Persuasion".
Last updated 2 years ago
jane-austen, novels, text-mining
10.91 score 95 stars 61 dependents 1.1k scripts 27k downloads
vetiver - Version, Share, Deploy, and Monitor Models
The goal of 'vetiver' is to provide fluent tooling to version, share, deploy, and monitor a trained model. Functions handle both recording and checking the model's input data prototype, and predicting from a remote API endpoint. The 'vetiver' package is extensible, with generics that can support many kinds of models.
Last updated 5 months ago
10.53 score 185 stars 1 dependent 466 scripts 1.3k downloads
qualtRics - Download 'Qualtrics' Survey Data
Provides functions to access survey results directly into R using the 'Qualtrics' API. 'Qualtrics' <https://www.qualtrics.com/about/> is an online survey and data collection software platform. See <https://api.qualtrics.com/> for more information about the 'Qualtrics' API. This package is community-maintained and is not officially supported by 'Qualtrics'.
Last updated 5 months ago
api, qualtrics, qualtrics-api, survey, survey-data
10.18 score 221 stars 272 scripts 2.0k downloads
bundle - Serialize Model Objects with a Consistent Interface
Typically, models in 'R' exist in memory and can be saved via regular 'R' serialization. However, some models store information in locations that cannot be saved using 'R' serialization alone. The goal of 'bundle' is to provide a common interface to capture this information, situate it within a portable object, and restore it for use in new settings.
Last updated 3 months ago
7.90 score 29 stars 3 dependents 153 scripts 1.7k downloads
tidylo - Weighted Tidy Log Odds Ratio
How can we measure how the usage or frequency of some feature, such as words, differs across some group or set, such as documents? One option is to use the log odds ratio, but the log odds ratio alone does not account for sampling variability; we haven't counted every feature the same number of times so how do we know which differences are meaningful? Enter the weighted log odds, which 'tidylo' provides an implementation for, using tidy data principles. In particular, here we use the method outlined in Monroe, Colaresi, and Quinn (2008) <doi:10.1093/pan/mpn018> to weight the log odds ratio by a prior. By default, the prior is estimated from the data itself, an empirical Bayes approach, but an uninformative prior is also available.
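To make the idea concrete, here is a minimal Python sketch of a prior-weighted log odds score in the style of Monroe, Colaresi, and Quinn (2008). It compares each group against all other groups combined and uses the pooled feature counts as the prior (an empirical Bayes choice). The function name, the input layout, and the "group versus the rest" comparison are assumptions made for this example; it is not presented as the exact computation inside 'tidylo'.

```python
import math
from collections import Counter

def weighted_log_odds_sketch(counts):
    """counts: {(group, feature): n}. Returns {(group, feature): z-score}.

    Assumes at least two distinct features, so no denominator is zero.
    """
    feature_total = Counter()
    group_total = Counter()
    for (group, feature), n in counts.items():
        feature_total[feature] += n
        group_total[group] += n
    grand_total = sum(group_total.values())

    scores = {}
    for (group, feature), y_in in counts.items():
        alpha_w = feature_total[feature]      # prior pseudo-count for this feature
        alpha_0 = grand_total                 # total prior mass
        y_out = feature_total[feature] - y_in # count in all other groups
        n_in = group_total[group]
        n_out = grand_total - n_in

        # Prior-smoothed odds of the feature inside and outside the group.
        odds_in = (y_in + alpha_w) / (n_in + alpha_0 - y_in - alpha_w)
        odds_out = (y_out + alpha_w) / (n_out + alpha_0 - y_out - alpha_w)
        delta = math.log(odds_in) - math.log(odds_out)

        # Approximate variance of the log odds difference.
        variance = 1.0 / (y_in + alpha_w) + 1.0 / (y_out + alpha_w)
        scores[(group, feature)] = delta / math.sqrt(variance)
    return scores

counts = {("a", "x"): 10, ("a", "y"): 2, ("b", "x"): 2, ("b", "y"): 10}
scores = weighted_log_odds_sketch(counts)
```

Dividing by the standard deviation is what addresses the sampling-variability concern: a large raw log odds difference built on a handful of counts is scaled down relative to the same difference built on many.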
Last updated 3 years ago
empirical-bayes, log-odds-ratio, tidy-data, tidyverse, weighted-log-odds
7.35 score 95 stars 157 scripts 326 downloads
cereal - Serialize 'vctrs' Objects to 'JSON'
The 'vctrs' package provides a concept of vector prototype that can be especially useful when deploying models and code. Serialize these object prototypes to 'JSON' so they can be used to check and coerce data in production systems, and deserialize 'JSON' back to the correct object prototypes.
Last updated 2 years ago
4.91 score 25 stars 2 dependents 4 scripts 1.1k downloads