Using R for Scientometrics Data Visualization (Second Edition)

This book gives a detailed introduction to the BIBLIOMETRIX package, developed in R by Massimo Aria and Corrado Cuccurullo of the University of Naples Federico II, Italy. The package covers most of the functionality needed for scientometrics and knowledge visualization, and it suits readers who like R and want to use it for scientometric and knowledge-mapping analysis. The book also briefly introduces several related R packages, including rAltmetric, wordcloud2, gender, and tidytext.

Preface

We first heard about bibliometrics 10 years ago. In 2008 Corrado was writing a monograph on fast-growing firms, a niche theme that he was approaching for the first time. The scientific literature was fairly limited. Scholars came from different disciplines, with a variety of approaches and methods that made it difficult to cumulate the findings. We first talked about this research problem during a football match among scholars. Our discussion of the various techniques for systematic literature analysis continued for several days. We enjoyed the exchange and concluded that bibliometrics was an interesting method and that it would be fun to explore it together.

Our goal became to examine the intellectual structure of research on fast-growing firms. We analyzed all the scientific production published in English-language academic journals. The analysis was complex because it required several steps and diverse analysis and mapping software tools, which were often available only under commercial licenses. The whole process was unwieldy, from data collection to data visualization. Massimo contributed greatly with his statistical and coding skills. Our collaboration continued in moments of fun, such as our frequent football matches. While analyzing data, we discovered that we enjoyed working together. In short, our friendship soon turned into a scientific collaboration that still lasts.

Within our departments and academic communities, the reaction to our work was positive. At that time, few people talked about bibliometrics in Italy, even from the point of view of research evaluation.
Years later we presented a bibliometric analysis paper on performance management at the Annual Conference of the Academy of Management, the largest international management meeting. On that occasion, too, we received positive feedback that pushed us to persist. In the same years, young Italian colleagues asked us for suggestions for their literature reviews and their research. Massimo opened some statistical analysis laboratories in R, and together we presented bibliometric analysis at some workshops. We are telling this story because without this feedback and these stimuli we would not have published bibliometrix release 0.1 in 2016. A year later we are at version 1.7, thanks to our growing passion for bibliometrics and to the suggestions that now come from scholars all around the world.

R-bibliometrix is currently a free tool for quantitative research in scientometrics and bibliometrics. It includes all the main bibliometric methods of analysis and is easy to use even for those who have no coding skills. Bibliometrix is a unique tool, developed in R, the language for statistical computing and graphics, according to a logical bibliometric workflow. R is highly extensible because it is an object-oriented and functional programming language, so it is fairly easy to automate analyses and create new functions. Because R is open source, it is also easy to get help from the user community, which is composed mainly of prominent statisticians. Bibliometrix is therefore flexible, can be upgraded rapidly, and can be integrated with other statistical R packages. That is why it is useful in a constantly changing science such as bibliometrics.

Today bibliometrix is more than just a statistical tool. It is becoming a community of international developers and users who exchange questions, impressions, opinions, and examples within an open source project.
For this reason, we are very honored that Dr. Jie Li of the Research Center for Safety and Security SciTech Trends at the Department of Safety Science and Engineering, Shanghai Maritime University, gave us the opportunity to tell you this story and to write an English preface for his book “Using R for Scientometrics Data Visualization”, which mainly introduces the BIBLIOMETRIX package to scholars and students.

We said that bibliometrix includes all the main bibliometric methods of analysis, but we use it especially for science mapping and not for measuring science, scientists, or scientific productivity. Synthesizing past research findings is one of the most important tasks in advancing a line of research. Various methods exist to summarize the amount of scientific activity in a domain, but bibliometrics has the potential to introduce a systematic, transparent, and reproducible review process. This is very relevant in an age when the number of academic publications is rising at a very fast pace and it is increasingly unfeasible to keep track of everything that is being published, and when the emphasis on empirical contributions is resulting in voluminous and fragmented research streams and a contested field.

Literature reviews increasingly play a crucial role in synthesizing past research findings in order to use the existing knowledge base effectively, advance a line of research, and give evidence-based insights into the practice of exercising and sustaining professional judgment and expertise. The overwhelming volume of new information, conceptual developments, and data is the milieu in which bibliometrics becomes useful: it provides a structured analysis of a large body of information, infers trends over time and the themes researched, identifies shifts in the boundaries of disciplines, detects the most prolific scholars and institutions, and shows the “big picture” of extant research.
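Naples, Italy, July 2017
Massimo Aria and Corrado Cuccurullo

Foreword

We now live in an era of big data in scientific literature. Faced with a massive body of literature, how can we quickly grasp the overall landscape and the future trends of a research field, direction, or topic? Against this background, the timely development of scientometric theories, methods, and techniques that bear directly on this question has become an effective way to address it. Mastering scientometric techniques and methods has also become a basic skill required of researchers today. Over the past decade or more, the theories and methods of scientometric data visualization have spread widely into the research practice of other disciplines. In China, this field, which takes scientific text data as its object of study and uses visualization to reveal the structure, evolution, and interaction of disciplines, is collectively known as “scientific knowledge mapping”.

Scientometric data visualization rests on a large body of basic theory from scientometrics (as well as bibliometrics, webometrics, and informetrics), such as the distribution of author productivity, co-citation, bibliographic coupling, topic co-occurrence, and author collaboration. It also draws on techniques and methods from statistics and network science, such as multidimensional scaling, cluster analysis, complex network analysis, natural language processing, and text mining. These theories and methods form the knowledge base for scientometric data visualization and are a prerequisite for knowledge-mapping analysis. Building on them, scholars in China and abroad have developed dozens of software tools and packages for mining scientific and technical text. Well-known examples include HistCite, BibExcel, CiteSpace, SCI2, and VOSviewer. These tools make it possible for scholars to analyze the literature of a field in order to understand its research landscape and dynamics.

Over the past five years of practical work in scientometrics and knowledge mapping, I have written books on CiteSpace, VOSviewer, BibExcel, and related tools, mainly to help scholars outside scientometrics quickly apply these methods in support of their research. Since 2016 I have organized four events on scientometrics and knowledge mapping and have exchanged ideas with several hundred knowledge-mapping enthusiasts from across China. The question I heard most often in these exchanges, and the one that made me reflect, was: “How should I interpret the maps I have obtained?” In my view, scientometric and knowledge-mapping methods merely offer us a new way of understanding the world of knowledge, and this way of understanding requires practitioners to think by combining their own disciplinary background with the theory and methods of knowledge mapping. When carrying out such an analysis, readers must be clear about what problem they want to solve, why knowledge mapping can solve it, and what advantages it has over other methods. In other words, before starting a scientometric or knowledge-mapping analysis, first define the research question, and then choose the form of knowledge map that will answer it.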
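This book is a companion volume to “CiteSpace: Text Mining and Visualization in Scientific Literature”, “Scientometrics and Knowledge Network Analysis: Practice with BibExcel and Other Software”, and “Principles and Applications of Scientific Knowledge Mapping: A Beginner's Guide to VOSviewer and CitNetExplorer”. Unlike the applications covered in those books, this book gives a detailed introduction to the BIBLIOMETRIX package, developed in R by Massimo Aria and Corrado Cuccurullo of the Department of Economics and Statistics, University of Naples Federico II. Readers are advised to use the link provided to check that they have the latest version of BIBLIOMETRIX, and to use the latest version wherever possible when analyzing data in actual research (BIBLIOMETRIX-R Package for Bibliometric and Co-Citation Analysis, http://www.bibliometrix.org/). The package covers most of the functionality needed for scientometrics and knowledge visualization (Figure 0.1), and it suits readers who like R and want to use it for scientometric and knowledge-mapping analysis. The book also introduces several related R packages, such as rAltmetric, wordcloud2, gender, and tidytext. It says little about full-text mining of English text in R and does not yet cover full-text mining of Chinese text. Future updates will expand the treatment of full-text mining in R.

Figure 0.1 Overview of bibliometrix functions

To help readers become familiar with the bibliometrix package, most of the examples in this book use the data bundled with the package, while some use Web of Science and Scopus datasets downloaded specifically for the analysis. The examples present the results of each analysis, but they do not describe those results or interpret them for a particular research purpose. By studying the results, readers can think for themselves about what they might do with these methods, or at least use them to understand the basic state of the field they care about.

The following conventions are used in this book:
Text after > is code.
Text after # is an explanation of the code.
Text after ## is the output of running the code.

I thank Massimo Aria and Corrado Cuccurullo, who gave generous help while this book was being written and contributed the English preface. I thank Yang Ling, director of Capital University of Economics and Business Press, for her strong support in publishing the scientometrics and knowledge mapping book series; Dr. Li Binbin of the Chinese Academy of Sciences for help with extracting submatrices; Yu Miao, a postdoctoral fellow at the University of Waterloo, for suggested revisions to the manuscript; and the book's editor, Xue Xiaohong, and graduate student Li Ping for their editing and careful proofreading.

Looking back on my experience in scientometrics and knowledge-mapping research and practice, I have mixed feelings. I sincerely hope that this book and the related series will further promote the development and spread of practical scientometrics and knowledge-mapping research in China, and that every reader will benefit.

Li Jie
Beijing, May 2018

About the author

Li Jie, Ph.D. and former postdoctoral researcher, was born in Shaanxi in 1987. He is an associate research fellow at the National Science Library, Chinese Academy of Sciences, working in scientometrics and safety science. He is co-editor-in-chief of the Journal of Integrated Security and Safety Science, deputy director of the youth editorial board of the Journal of Safety and Environment, an editorial board member of Safety Science and other journals, and a member of the National Committee on Scientometrics and Informetrics. He has published more than 60 academic papers and 6 books, including “CiteSpace: Text Mining and Visualization in Scientific Literature”, “Principles and Applications of Scientific Knowledge Mapping”, “Scientometrics and Knowledge Network Analysis”, and “Using R for Scientometrics Data Visualization”.

Contents

Lecture 1 R Basics
1.1 Downloading R
1.2 Installing R
1.3 Installing RStudio
1.4 Installing Packages
1.5 Loading Packages
1.6 Package Help
1.7 Citing Packages
1.8 Using Package Data
1.9 Loading User Data
1.10 Programming Errors

Lecture 2 Scientometric Data Collection
2.1 WoS Data
2.2 Scopus Data
2.3 PubMed Data

Lecture 3 Fundamentals of Scientometric Analysis in R
3.1 Data Conversion in R
3.2 Meaning of Data Column Names
3.3 Merging Datasets
3.4 Removing Duplicates
3.5 Slicing Data
3.6 Editing Data
3.7 Descriptive Analysis
3.8 Statistical Visualization
3.9 Citation Information Analysis
3.10 Altmetric Information
3.11 Author Ranking Analysis
3.12 Author Gender Inference
3.13 h-type Indices
3.14 Lotka Analysis
3.15 Temporal Distribution of Knowledge Units
3.16 Computing LCS for Documents and Authors
3.17 Normalizing Citation Counts
3.18 Term Extraction

Lecture 4 Scientific Data Visualization in R
4.1 Knowledge-Unit Membership Matrices
4.2 Knowledge-Unit Co-occurrence Matrices
4.3 Submatrices of Membership Matrices
4.4 Submatrices of Co-occurrence Matrices
4.5 Normalizing Co-occurrence Matrices
4.6 Network Visualization
4.7 Visualization with VOSviewer
4.8 Collaboration Network Visualization
4.9 Coupling Network Visualization
4.10 Co-citation Network Visualization
4.11 Historical Citation Network Analysis
4.12 Co-word Network Visualization
4.13 Conceptual Structure Maps of Terms
4.14 Semantic Map Analysis
4.15 Thematic Evolution Visualization
4.16 Word Cloud Visualization
4.17 PubMed Data Visualization
4.18 Full-Text Mining and Visualization
4.19 Dynamics of Highly Productive Authors
4.20 Strategic Diagrams of Coupling Networks
4.21 Temporal Visualization of References
4.22 Partitioned Network Plots

Lecture 5 The Web Version: R-biblioshiny
5.1 Data Import and Format Conversion (Data)
5.2 Data Filtering (Filter)
5.3 Main Dataset Information (Dataset)
5.4 Source Information (Sources)
5.5 Author Information (Authors)
5.6 Document Information (Documents)
5.7 Clustering (Clustering)
5.8 Conceptual Structure (Conceptual Structure)
5.9 Intellectual Structure (Intellectual Structure)
5.10 Social Structure (Social Structure)

Lecture 6 Hands-on Exercises
6.1 Bibliometrics of a Specific Author's Papers
6.2 Scientometrics of a Specific Paper
6.3 Bibliometrics of a Specific Institution's Papers
6.4 Comparative Bibliometrics of Specific Journals
6.5 Bibliometrics of Specific Conference Papers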
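6.6 Bibliometrics of Literature on a Specific Topic
6.7 Bibliometrics of Literature on a Specific Method

References

Appendices
Appendix 1 Core R Code for Scientometrics
Appendix 2 Meaning of Core Web of Science Fields
Appendix 3 Commonly Used Scientometric Data Visualization Tools
Appendix 4 R Packages for Scientometric Data Visualization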