Interfacing corpus linguistics and computational stylistics

2013 ◽  
Vol 18 (2) ◽  
pp. 254-280 ◽  
Author(s):  
Lukasz Grabowski

This study attempts to examine the potential of selected corpus linguistics and computational stylistics methods in the investigation of translation universals in translational literary Polish. It deals with T-universals (Chesterman 2004), with emphasis on the simplification hypothesis, as manifested in the core patterns of lexical use (Laviosa 1998) and the levelling out hypothesis (Baker 1996). To that end, the purpose-designed corpora, each with approximately 350,000 tokens, of contemporary translational and non-translational literary Polish were compiled. The results confirm the simplification and the levelling out hypotheses but only with reference to the mean sentence length and variance for the mean sentence length. On the other hand, the results of multivariate analyses (Principal Components Analysis and Cluster Analysis) confirm the levelling out hypothesis that translations are more alike as compared with native texts.

1997 ◽  
Vol 48 (7) ◽  
pp. 969 ◽  
Author(s):  
Sarita Jane Bennett

Genetic variation between and within populations of Trifolium glomeratum (cluster clover) was studied using seed collected from 2 sites in Western Australia: Mount Barker in the south and Kwelkan in the wheatbelt. Seed was collected at 64 subplots within each site and the material was grown at the University Field Station at Shenton Park, Perth. Seventeen morphological characters were scored and the results were analysed using analysis of variance, principal components analysis, and cluster analysis. Within-site variation was much greater than had previously been shown, and a considerable amount of between-site variation was present. It is suggested that within-site variation is due to a small amount of heterozygosity, as a result of limited outbreeding, being present in each population. The 2 populations are shown to be distinct from each other, with the population from Mount Barker containing more within-site variation. It is suggested that this is a result of climatic stress influencing and reducing the amount of variation being maintained in the Kwelkan population.


1994 ◽  
Vol 7 (3-4) ◽  
pp. 175-180 ◽  
Author(s):  
H. Förstl ◽  
R. Levy ◽  
A. Burns ◽  
P. Luthert ◽  
N. Cairns

Thirty-seven patients with neuropathologically verified Alzheimer's disease (AD) have been studied prospectively. A principal components analysis of neuron numbers in cortical and subcortical areas revealed two variables: Variable I with high loadings for the hippocampo-parahippocampo-parietal neuron counts and Variable II with high loadings for coeruleo-frontal cell numbers. Both may reflect functional neuroanatomical connections which may act as pathways of neurodegeneration in AD. A cluster analysis based on these neuron numbers yielded three groups of patients: Cluster A with low hippocampo-parahippocampo-parietal cell counts, Cluster B with well-preserved neuron numbers, and Cluster C with low coeruleo-frontal neuron numbers. Differences in clinical features between these patient groups indicated the potential clinical relevance of these clusters.


2018 ◽  
Vol 13 (4) ◽  
pp. 439
Author(s):  
Juliao Soares de Souza Lima ◽  
Samuel Assis Silva

The quality of coffee beverages has been under study due to the demand of the consumer market for both arabica and conilon coffee. The aim of this work was to study beverage quality from different clones by means of sensory analysis, in 13 clones of the variety Victoria INCAPER 8142 produced at average altitudes of 100.0 m and 528.0 m and with the cherry fruits processed by natural drying or depulping. Fuzzy classification was adopted for the global scores obtained in the sensory analysis, on a scale of 70.0 to 100.0 points, with the Euclidean distance from the cluster analysis being used to define the dissimilarity between the global fuzzified scores for the different clones at the two altitudes and for the two methods of processing the fruit. Clones C4 and C10, at the intermediate maturation stage, presented a mean global score (GS) of 85.0 points for the coffee produced at the altitude of 528.0 m and for the depulped fruit, corresponding to a degree of fuzzy pertinence (FI) of 0.50, and being classified as fine coffee. These same clones presented dissimilarities in the beverage produced by the depulped fruit, with better quality for the coffee at the higher altitude. The fuzzy classification taken together with the cluster analysis to interpret the mean global scores (GS) in the sensory analysis of the beverage for the different treatments under study identified variation in beverage quality.
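
The reported pairing of a global score of 85.0 with a fuzzy pertinence of 0.50 is consistent with a linear membership function over the 70.0 to 100.0 scale. The abstract does not state the function actually used, so the form below is an assumption offered only as a worked check:

```latex
\mu(GS) = \frac{GS - 70}{100 - 70},
\qquad
\mu(85) = \frac{85 - 70}{30} = 0.50
```

Under this assumed form, a score of 70.0 maps to 0 and a score of 100.0 maps to 1.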


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Sayed Ali HOSSEINI ◽  
Zohreh HADYANI ◽  
Hossein YAGHFOORI

Safety is a basic issue in every social system, and communities consider safety one of their main priorities. One of the most important factors that put the safety of communities at risk is the threat posed by crime. This paper aims to spatially analyze crime occurrence in various regions of Iran with an emphasis on safety. The research method is descriptive-analytical, and a documentary and library data collection method is used. In this paper, the Similarity, COPRAS, mean rank, and cluster analysis methods are applied. The final results of the cluster analysis based on the mean rank method indicate a wide gap between the provinces of the country in terms of survey indicators: the final coefficient obtained for the provinces in the sixth cluster (the most unsafe group) is about 45 times the final coefficient of the provinces in the first cluster (the safest group).


2021 ◽  
Vol 3 (1) ◽  
pp. 0210105
Author(s):  
Rahmi Lathifah Islami ◽  
Pardomuan Robinson Sihombing

A steady increase in exports yields foreign exchange for a country, which in turn funds its growth. In Indonesia, exports are one of the biggest contributors of foreign exchange. Because Indonesia exports to more than 100 countries, these countries need to be grouped by similarity. Biplot and cluster analysis are statistical methods used to classify data based on explanatory variables. The data acquired contain outliers, that is, observations that appear extremely different from the rest of the data; these are identified by the leverage method. In summary, this research applies the K-Medians clustering method with Manhattan distance to handle outliers while grouping the countries based on their export data. The data set contains export data for 182 countries in 2017. R 3.5.1 was used for the calculations. The clustering shows that each continent forms different clusters: Asia has 4 clusters, while each of the others has 3. In addition, several clusters have a high value of Indonesian exports for certain variables.
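
The K-Medians idea named in this abstract can be illustrated with a minimal sketch. The study itself used R 3.5.1, and its code is not shown here, so the Python below is not the authors' implementation. The function names, the choice of the first k points as starting centres, and the toy data are all assumptions made for illustration.

```python
from statistics import median


def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def k_medians(points, k, max_iter=100):
    """Cluster points with K-Medians under Manhattan distance.

    Assumption: the first k points serve as the initial centres.
    Returns (labels, centres).
    """
    centres = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centre (L1 distance).
        labels = [
            min(range(k), key=lambda j: manhattan(p, centres[j]))
            for p in points
        ]
        # Update step: each centre becomes the coordinate-wise median
        # of its members; an empty cluster keeps its previous centre.
        new_centres = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centres.append(tuple(median(col) for col in zip(*members)))
            else:
                new_centres.append(centres[j])
        if new_centres == centres:
            break
        centres = new_centres
    return labels, centres


# Hypothetical toy data: two groups plus one outlier at (60, 10).
points = [(1, 1), (10, 10), (2, 1), (11, 10), (1, 2), (10, 11), (2, 2), (60, 10)]
labels, centres = k_medians(points, k=2)
print(labels)   # [0, 1, 0, 1, 0, 1, 0, 1]
print(centres)  # [(1.5, 1.5), (10.5, 10.0)]
```

In this toy run the outlier joins the second cluster, yet that cluster's centre stays at x = 10.5, whereas the mean of the same x-values would be 22.75. This robustness of the median is the usual reason for preferring K-Medians when outliers are present.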


2020 ◽  
Vol 5 (2) ◽  
pp. 25-35
Author(s):  
Bruno Chauvin ◽  
Dimitra Macri ◽  
Etienne Mullet

The study aimed to structure the cross-country differences in risk perception that have been reported in the literature, using cluster analysis. A 30-hazard x 19-country matrix was composed using as inputs the mean risk estimation levels available in the literature, and cluster analysis was conducted on this matrix. Six clusters of countries were found: a Communist bloc cluster (USSR and Hungary), a Nordic cluster (Finland, Norway, Sweden), an Arab cluster (Egypt and Kuwait), a Developing countries cluster (Brazil and South Korea), a Western cluster (France, Portugal, Spain, USA), and a cluster comprising four countries or territories (Burkina Faso, China-Hong-Kong, China-Macao, Russia) whose only common denominator seems to be that many economic and/or societal problems exist in them. The factors that may explain this clustering are discussed, and a new, more analytic approach to cross-national differences in risk perception is suggested.


2009 ◽  
pp. 81-114
Author(s):  
Ferruccio Biolcati Rinaldi ◽  
Daniele Checchi ◽  
Chiara Guglielmetti ◽  
Silvia Salini ◽  
Matteo Turri

The paper consists of two parts. The first is more general: it introduces university ranking, presents the leading international rankings, and discusses the uses people make of rankings. The second focuses on the Italian Censis-la Repubblica ranking and develops two kinds of analysis: after considering the validity and reliability of the indicators, principal components analysis and cluster analysis are applied to a partial replication of the Censis-la Repubblica data. A list of points to pay attention to emerges from these analyses; it can be useful when defining rankings of complex institutions such as universities.

Key words: ranking, university ranking, Censis-la Repubblica, validity and reliability, normalisation and combination of indicators.

