Title: | Jane Austen's Complete Novels |
---|---|
Description: | Full texts for Jane Austen's 6 completed novels, ready for text analysis. These novels are "Sense and Sensibility", "Pride and Prejudice", "Mansfield Park", "Emma", "Northanger Abbey", and "Persuasion". |
Authors: | Julia Silge [aut, cre] |
Maintainer: | Julia Silge <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.0.0.9000 |
Built: | 2024-11-17 04:39:37 UTC |
Source: | https://github.com/juliasilge/janeaustenr |
Returns a tidy data frame of Jane Austen's 6 completed, published novels with two columns: text, which contains the text of the novels divided into elements of up to about 70 characters each, and book, which contains the titles of the novels as a factor in order of publication.
austen_books()
Users should be aware that there are some differences in usage between the novels as made available by Project Gutenberg. For example, "anything" vs. "any thing", "Mr" vs. "Mr.", and using underscores vs. all caps to indicate italics/emphasis.
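To see how much such a variant matters before tokenizing, one option is to count the lines containing each spelling per novel. The following is a minimal sketch using base R only. The name variant_counts is illustrative and not part of the package, and a phrase that happens to be split across two lines is not counted.

```r
library(janeaustenr)

books <- austen_books()

# Flag lines containing each spelling (case-insensitive).
# "\\b" restricts the match to whole words.
one_word  <- grepl("\\banything\\b", books$text, ignore.case = TRUE)
two_words <- grepl("\\bany thing\\b", books$text, ignore.case = TRUE)

# Sum the flags within each novel; tapply follows the factor level order
# of the book column.
variant_counts <- data.frame(
  book      = levels(books$book),
  anything  = as.vector(tapply(one_word, books$book, sum)),
  any_thing = as.vector(tapply(two_words, books$book, sum))
)

variant_counts
```

If the counts differ sharply between novels, normalizing the spelling (for example with gsub()) before comparing word frequencies across books may be worthwhile.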
A data frame with two columns: text and book
library(janeaustenr)
library(dplyr)

# Count the number of text lines in each novel
austen_books() %>%
  group_by(book) %>%
  summarise(total_lines = n())
A dataset containing the text of Jane Austen's 1815 novel "Emma". The UTF-8 plain text was sourced from Project Gutenberg and is divided into elements of up to about 70 characters each. (Some elements are blank.)
emma
A character vector with 15297 elements
http://www.gutenberg.org/ebooks/158
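Because each novel is stored as a plain character vector, base R functions are enough to inspect it. The sketch below checks the properties described above (number of elements, blank elements, line width) for emma. The same calls should apply to the other five vectors, and non_blank is just an illustrative variable name.

```r
library(janeaustenr)

length(emma)        # total number of elements (lines), blank ones included
sum(emma == "")     # how many elements are blank
max(nchar(emma))    # width of the longest line, in characters

# Drop blank elements before further processing
non_blank <- emma[emma != ""]
head(non_blank, 3)
```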
A dataset containing the text of Jane Austen's 1814 novel "Mansfield Park". The UTF-8 plain text was sourced from Project Gutenberg and is divided into elements of up to about 70 characters each. (Some elements are blank.)
mansfieldpark
A character vector with 14768 elements
http://www.gutenberg.org/ebooks/141
A dataset containing the text of Jane Austen's novel "Northanger Abbey", published posthumously in 1818. The UTF-8 plain text was sourced from Project Gutenberg and is divided into elements of up to about 70 characters each. (Some elements are blank.)
northangerabbey
A character vector with 7840 elements
http://www.gutenberg.org/ebooks/121
A dataset containing the text of Jane Austen's novel "Persuasion", published posthumously in 1818. The UTF-8 plain text was sourced from Project Gutenberg and is divided into elements of up to about 70 characters each. (Some elements are blank.)
persuasion
A character vector with 8328 elements
http://www.gutenberg.org/ebooks/105
A dataset containing the text of Jane Austen's 1813 novel "Pride and Prejudice". The UTF-8 plain text was sourced from Project Gutenberg and is divided into elements of up to about 70 characters each. (Some elements are blank.)
prideprejudice
A character vector with 12447 elements
http://www.gutenberg.org/ebooks/1342
A dataset containing the text of Jane Austen's 1811 novel "Sense and Sensibility". The UTF-8 plain text was sourced from Project Gutenberg and is divided into elements of up to about 70 characters each. (Some elements are blank.)
sensesensibility
A character vector with 12262 elements
http://www.gutenberg.org/ebooks/161
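The six character vectors can also be combined by hand into a single two-column data frame, which is useful when only a subset of the novels is needed. This is a sketch of one way to do it, not necessarily how austen_books() is implemented. The names novels and combined are illustrative, and the title labels are taken from the package description, so they may differ from the factor levels that austen_books() returns.

```r
library(janeaustenr)

# Named list of the novels in order of publication.
# Remove entries here to work with a subset.
novels <- list(
  "Sense and Sensibility" = sensesensibility,
  "Pride and Prejudice"   = prideprejudice,
  "Mansfield Park"        = mansfieldpark,
  "Emma"                  = emma,
  "Northanger Abbey"      = northangerabbey,
  "Persuasion"            = persuasion
)

# One row per line of text, with the title repeated for every line of
# that novel and stored as a factor in the list order.
combined <- data.frame(
  text = unlist(novels, use.names = FALSE),
  book = factor(rep(names(novels), times = lengths(novels)),
                levels = names(novels)),
  stringsAsFactors = FALSE
)

table(combined$book)
```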