Description

A practical guide to data-intensive humanities research using the Python programming language

The use of quantitative methods in the humanities and related social sciences has increased considerably in recent years, allowing researchers to discover patterns in a vast range of source materials. Despite this growth, there are few resources addressed to students and scholars who wish to take advantage of these powerful tools. Humanities Data Analysis offers the first intermediate-level guide to quantitative data analysis for humanities students and scholars using the Python programming language. This practical textbook, which assumes a basic knowledge of Python, teaches readers the necessary skills for conducting humanities research in the rapidly developing digital environment.

The book begins with an overview of the place of data science in the humanities, and proceeds to cover data carpentry: the essential techniques for gathering, cleaning, representing, and transforming textual and tabular data. Then, drawing from real-world, publicly available data sets that cover a variety of scholarly domains, the book delves into detailed case studies. Focusing on textual data analysis, the authors explore such diverse topics as network analysis, genre theory, onomastics, literacy, author attribution, mapping, stylometry, topic modeling, and time series analysis. Exercises and resources for further reading are provided at the end of each chapter.

An ideal resource for humanities students and scholars aiming to take their Python skills to the next level, Humanities Data Analysis illustrates the benefits that quantitative methods can bring to complex research questions.

  • Appropriate for advanced undergraduates, graduate students, and scholars with a basic knowledge of Python
  • Applicable to many humanities disciplines, including history, literature, and sociology
  • Offers real-world case studies using publicly available data sets
  • Provides exercises at the end of each chapter for students to test acquired skills
  • Emphasizes visual storytelling via data visualizations

About the Authors
Folgert Karsdorp, Mike Kestemont, and Allen Riddell
Table of Contents
  • Preface
  • I Data Analysis Essentials
  • Chapter 1: Introduction
    • 1.1 Quantitative Data Analysis and the Humanities
    • 1.2 Overview of the Book
    • 1.3 Related Book
    • 1.4 How to Use This Book
      • 1.4.1 What you should know
      • 1.4.2 Packages and data
      • 1.4.3 Exercises
    • 1.5 An Exploratory Data Analysis of the United States' Culinary History
    • 1.6 Cooking with Tabular Data
    • 1.7 Taste Trends in Culinary US History
    • 1.8 America's Culinary Melting Pot
    • 1.9 Further Reading
  • Chapter 2: Parsing and Manipulating Structured Data
    • 2.1 Introduction
    • 2.2 Plain Text
    • 2.3 CSV
    • 2.4 PDF
    • 2.5 JSON
    • 2.6 XML
      • 2.6.1 Parsing XML
      • 2.6.2 Creating XML
      • 2.6.3 TEI
    • 2.7 HTML
      • 2.7.1 Retrieving HTML from the web
    • 2.8 Extracting Character Interaction Networks
    • 2.9 Conclusion and Further Reading
  • Chapter 3: Exploring Texts Using the Vector Space Model
    • 3.1 Introduction
    • 3.2 From Texts to Vectors
      • 3.2.1 Text preprocessing
    • 3.3 Mapping Genres
      • 3.3.1 Computing distances between documents
      • 3.3.2 Nearest neighbors
    • 3.4 Further Reading
    • 3.5 Appendix: Vectorizing Texts with NumPy
      • 3.5.1 Constructing arrays
      • 3.5.2 Indexing and slicing arrays
      • 3.5.3 Aggregating functions
      • 3.5.4 Array broadcasting
  • Chapter 4: Processing Tabular Data
    • 4.1 Loading, Inspecting, and Summarizing Tabular Data
      • 4.1.1 Reading tabular data with Pandas
    • 4.2 Mapping Cultural Change
      • 4.2.1 Turnover in naming practices
      • 4.2.2 Visualizing turnovers
    • 4.3 Changing Naming Practices
      • 4.3.1 Increasing name diversity
      • 4.3.2 A bias for names ending in 𝑛
      • 4.3.3 Unisex names in the United States
    • 4.4 Conclusions and Further Reading
  • II Advanced Data Analysis
  • Chapter 5: Statistics Essentials: Who Reads Novels?
    • 5.1 Introduction
    • 5.2 Statistics
    • 5.3 Summarizing Location and Dispersion
      • 5.3.1 Data: Novel reading in the United States
    • 5.4 Location
    • 5.5 Dispersion
      • 5.5.1 Variation in categorical values
    • 5.6 Measuring Association
      • 5.6.1 Measuring association between numbers
      • 5.6.2 Measuring association between categories
      • 5.6.3 Mutual information
    • 5.7 Conclusion
    • 5.8 Further Reading
  • Chapter 6: Introduction to Probability
    • 6.1 Uncertainty and Thomas Pynchon
    • 6.2 Probability
      • 6.2.1 Probability and degree of belief
    • 6.3 Example: Bayes's Rule and Authorship Attribution
      • 6.3.1 Random variables and probability distributions
    • 6.4 Further Reading
    • 6.5 Appendix
      • 6.5.1 Bayes's rule
      • 6.5.2 Fitting a negative binomial distribution
  • Chapter 7: Narrating with Maps
    • 7.1 Introduction
    • 7.2 Data Preparations
    • 7.3 Projections and Basemaps
    • 7.4 Plotting Battles
    • 7.5 Mapping the Development of the War
    • 7.6 Further Reading
  • Chapter 8: Stylometry and the Voice of Hildegard
    • 8.1 Introduction
    • 8.2 Authorship Attribution
      • 8.2.1 Burrows's Delta
      • 8.2.2 Function words
      • 8.2.3 Computing document distances with Delta
      • 8.2.4 Authorship attribution evaluation
    • 8.3 Hierarchical Agglomerative Clustering
    • 8.4 Principal Component Analysis
      • 8.4.1 Applying PCA
      • 8.4.2 The intuition behind PCA
      • 8.4.3 Loadings
    • 8.5 Conclusions
    • 8.6 Further Reading
  • Chapter 9: A Topic Model of United States Supreme Court Opinions, 1900–2000
    • 9.1 Introduction
    • 9.2 Mixture Models: Artwork Dimensions in the Tate Galleries
    • 9.3 Mixed-Membership Model of Texts
      • 9.3.1 Parameter estimation
      • 9.3.2 Checking an unsupervised model
      • 9.3.3 Modeling different word senses
      • 9.3.4 Exploring trends over time in the Supreme Court
    • 9.4 Conclusion
    • 9.5 Further Reading
    • 9.6 Appendix: Mapping Between Our Topic Model and Lauderdale and Clark (2014)
  • Epilogue: Good Enough Practices
  • Bibliography
  • Index
  • Details
    Year of publication: 2021
    Subject area: General
    Genre: History, Imports
    Category: Humanities
    Topic: Reference works
    Medium: Book
    Format: Hardcover
    ISBN-13: 9780691172361
    ISBN-10: 0691172366
    Language: English
    Binding: Hardcover
    Authors: Riddell, Allen; Karsdorp, Folgert; Kestemont, Mike
    Publisher: Princeton University Press
    Responsible person for the EU: Libri GmbH, Europaallee 1, D-36244 Bad Hersfeld, gpsr@libri.de
    Dimensions: 261 x 179 x 30 mm
    By/With: Allen Riddell (et al.)
    Publication date: 12 January 2021
    Weight: 1.175 kg
    Item ID: 118870361