Using Information Retrieval to Construct an Intelligent E-book with Keyword Concept Map

More and more people enjoy the brand new experience that an e-book brings, but the traditional e-book still has plenty of room for improvement. One reason is that the book has to be read page by page; thus, it is not easy to grasp the overall structure of such a book. We therefore propose a novel system based on information retrieval technologies to automatically create a keyword concept map for each section of the book. In addition to showing the context in which each keyword in the concept map is located, the system associates a hyperlink with each keyword, making it easy for a reader to move to the context associated with that keyword. Equipped with the keyword concept map and the hyperlink associated with each keyword, the reader can be expected to achieve better learning results. Our experimental results show that the proposed e-book system with the keyword concept map provides a better learning result than a traditional e-book does, in terms of both the scores received after learning and practicing and the results of a satisfaction questionnaire on learning, practicing, and system satisfaction.


INTRODUCTION
The e-book (Woody et al., 2010) can typically be regarded as the product of digitizing a book, which means that an e-book is composed of any kind of digitized content. A more precise definition (PC Magazine Encyclopedia) is a book that can be read on interactive digital devices, such as a desktop computer, cell phone, or tablet. The basic concept of interactivity means that the e-book offers the audience not only digital content but also a smart device. From displaying images and audio, through interactive multimedia, to smart aids, these functions are usually impossible for a paper book. That is why, with the advance of the e-book, a digitized book has now become a variety of multimedia products (Siegenthaler et al., 2010). However, the reading styles and strategies for a traditional e-book usually require reading page by page, which makes it hard for the reader to grasp the overall structure of such a book. When the amount of text is small, we can easily understand the meaning of the contents in the first half of an e-book; but most people may not be able to handle the contents of the other half if the book contains too much material (Squire, 1987). Because our brain cannot remember everything we see and shifts its focus to the most recent part, the reader usually cannot easily comprehend the relationships among different concepts (or contents) of the e-book. A reader who does not recognize all the relationships among concepts is left with fragments after reading an e-book. Most readers will probably understand some ideas from the book, but they cannot get the big picture and will not be able to integrate these concepts. The good news is that, unlike a traditional paper book, the way an e-book displays its contents can be changed: more interactive functions can be added to solve the problem that the relationships among the concepts of a book are hard to recognize.
Because the most important characteristic of the e-book is the interaction it provides between reader and book, it can be used to improve the learning performance of the reader when concepts from different contents cannot easily be integrated. This paper presents a novel system that uses a keyword concept map to provide an integrated viewpoint of the book so that the reader can understand its overall structure. The basic idea of the proposed system is to let readers understand most of the book very quickly, by using the keywords of its contents, when they read the book for the first time. However, finding the keywords is difficult for someone who is reading the book for the first time. That is why the proposed system employs term frequency-inverse document frequency (TF-IDF) (Aizawa, 2003) to calculate the frequency of words in different sections. Based on this information, the proposed system can dynamically adjust the frequency threshold of words to determine which words will be extracted. The proposed system then automatically creates the keyword concept map for each section of the book from the frequency of words. With this kind of e-book system at hand, clicking a keyword moves the reader to the particular page they want; thus, the reader can easily understand the overall structure of the book. The main contribution of the paper is that the proposed system automatically creates a keyword concept map to enhance the learning performance of the reader.

Concept Maps
Because enhancing learning performance is a critical problem, several tools and systems have been presented in recent years. Among them, mindtools are computer applications that help learners organize the knowledge they have and integrate it with other knowledge in a given learning and thinking situation (Jonassen et al., 1998; Jonassen, 1999; Jonassen & Carr, 2000). Mindtools for education can typically be divided into five categories: (1) database mindtools, (2) graph mindtools, (3) concept mapping, (4) Internet search mindtools, and (5) visualization mindtools (Averill et al., 2005). Among these, concept mapping is one of the best-known approaches because it is an effective knowledge visualization tool that helps learners identify and understand the structure of knowledge, and it can be applied widely across several fields (Cañas et al., 2005). Since it was presented by Novak and Gowin in the 1980s, the concept map has undergone several changes in its applications and has become one of the useful tools for education (Erdogan, 2009). In education, the concept map can serve as a tool to organize and construct knowledge (Erdogan, 2009; Hwang, Shi, & Chu, 2011; Li, Chen, & Yang, 2013), computer assisted

Contribution of this paper to the literature
• In this paper, we propose a novel system that uses a keyword concept map to provide an integrated viewpoint of the book so that the reader can understand its overall structure.
• The basic idea of the proposed system is to let the reader quickly and easily understand the important concepts in the book, by using the keywords of its contents, when the reader reads the book for the first time.
• The proposed system automatically creates the keyword concept map for each section of the book using the frequency of words. With this kind of e-book system at hand, clicking a keyword moves the reader to the particular page they want. Thus, the reader can easily understand the overall structure of the e-book.
Many studies have shown that the concept map, whether constructed in the traditional way with pen and paper or in the modern way as a digital concept map built with computer technologies, can enhance the learning performance of learners (Asan, 2007; Hwang et al., 2011, 2013; Yang et al., 2013). For example, Asan applied the concept map to a natural science course for the 5th grade, and the results showed that the learning performance of learners improved significantly when they used the concept map (Asan, 2007). As another example, Yang et al. applied the concept map on cell phones: students scanned a QR code in a paper book to obtain the concept map that supported their learning (Yang et al., 2013). The final results of that study (Yang et al., 2013) also showed that this idea can reduce the amount of course content that students cannot fully understand. One characteristic of the concept map is that it can show the knowledge in our mind as an image; compared with plain text alone, visualizing concepts is a better way to make content easy to understand. That is why most logos and trademarks contain text and images at the same time (Anna, 2000; Asan, 2007). Moreover, the concept map can help us understand the concepts and relationships of different subjects, catch the key points quickly, and then understand most of the contents of an e-book (White and Gunstone, 1992).
Seen from the perspective of how advances in information technology have changed teaching, learning, and learning tools, the move from the concept map on paper to the digital concept map is a representative example. The advantages of the digital concept map are the main reason why several recent studies have used it in their experiments to replace drills on paper. In the study "The effects of a concept map-based information display in an electronic portfolio system on information processing and retention in a fifth-grade science class covering the Earth's atmosphere," Kim et al. discussed the difference in information processing and retention performance between a traditional folder-based information display and a concept map-based information display. Wu et al. (2012) presented a novel learning strategy that integrates the digital concept map with real-time assessment and feedback to improve the learning performance of students; it addresses the problem that, when students use concept maps on paper, the teacher cannot quickly evaluate their maps and provide applicable feedback. Chu et al. (2014), in the study "A cooperative computerized concept-mapping approach to improving students' learning performance in web-based information-seeking activities," presented another solution that digitizes the concept map to further improve the learning performance of students in web-based information-seeking activities.
Based on these perspectives, this study adds keywords to each section through the proposed system so that learners can easily preview, learn, and review the material, thereby enhancing learning performance and reducing the number of unknown or unclear concepts in a course.

Electronic Book
In the early stage of e-book development, the most common understanding of an e-book was the contents of a paper book displayed in digital formats (Ismail & Zainab, 2005), which need to be displayed on a computer or an e-book reader (Rao, 2003). At that time, most studies focused on challenges and opportunities (Cox, 2004), on comparing the electronic textbook with the paper textbook (Shepperd et al., 2008; Slater, 2009; Christianson et al., 2005), and on whether the paper book would disappear (Van der Velde et al., 2009). Even though the e-book has several useful distinguishing features (e.g., flexibility, reusability, and creativity) that have attracted a large number of advocates, some studies have pointed out that students would rather learn from a textbook (i.e., a paper book) than from an e-book (Woody et al., 2010). Another study (Gregory, 2008) also argued that although students will use the e-book, they much prefer to use the traditional textbook to learn.
The advocates of the e-book and of the paper book are now beginning to see that these arguments will be endless if they are conducted from different perspectives. That neither can beat the other has become a common consensus, and consequently the research focus has shifted from the argument between e-book and paper book to how to apply the e-book and in which fields, especially in education and learning. The change can easily be found in recent studies. Korat and Shamir (2008) used the e-book to support the emergent literacy of preschoolers, and Yang et al. (2013) used the smart phone to scan a QR code to obtain auxiliary materials and a concept map, so that the smart phone serves as an e-book reader for the relevant information.
In fact, Coyle (2008) mentioned that the key to increasing the value of the e-book is to develop innovative technologies that give the e-book a variety of learning methods, rather than only digitizing textbooks. This viewpoint has gradually become a promising research trend in recent studies. Le et al. (2013) presented a visual cue map to solve the reading problem on the e-book, and their experimental results showed that the solution improves reading, reviewing, and navigational performance. Yi et al. (2014) presented an integrated solution that combines a reading guidance module and an annotation map on the e-book and discussed the impact of such systems on college students. Moreover, Lim and Hew (2014) focused on how students feel when they use NG-eBook, which supports annotation and information sharing. In addition to innovation, some studies have discussed human factors for the e-book; for example, the study "Investigating E-book Reading Patterns: A Human Factors Perspective" focused on different cognitive styles (e.g., browsing patterns, navigation facilities, and annotation patterns) to observe human learning behavior (Hwang et al., 2014).
With the advance of the e-book (e.g., increasing its value through innovation, or analyzing human cognition to make it better suited to its users), one critical issue is how to improve learning performance. Therefore, this paper presents a novel keyword concept map method that lets the learner understand the content of an e-book easily within a reasonable time, or in other words flattens the learning curve.

Data Mining, Machine Learning and Information Retrieval
Document analysis technologies typically play a key role in finding the important information in a textbook, learning material, examination paper, and even the learning behavior of students on an e-Learning system. The well-known technologies for the e-book are data mining (Fayyad et al., 1996), machine learning (Dillenbourg, 1999), and information extraction and information retrieval (Baeza-Yates & Ribeiro-Neto, 1999). More precisely, data mining and machine learning technologies can be used to find hidden information in the textbook and in learning behavior. This gives the teacher additional information on how to help students improve their learning performance, and it helps students know which parts of the contents they need to review again to fully understand all the concepts of the textbook. Information extraction technologies for the e-book are usually used for data preprocessing, such as extracting the most important contents of the textbook and the learning behavior of students. Generally speaking, information extraction reduces and filters the contents passed to the information retrieval tool to avoid redundant load on the system. Since the main work of information retrieval for the e-book is to understand the relationships between the contents of different sections or parts, the vector space model (VSM) (Baeza-Yates & Ribeiro-Neto, 1999) is one of the important technologies for computing the similarity between contents or concepts. In addition to VSM, other document similarity methods can also be combined with data mining techniques to give a complete analysis of the textbook and the e-book.
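As an illustration of how the vector space model can compare two sections, the following minimal sketch represents each section as a bag-of-words vector and computes the cosine similarity between them. The function names `term_vector` and `cosine_similarity` and the simple lowercase tokenizer are assumptions made for this sketch; they are not taken from the proposed system.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Turn a section's text into a term-count vector (hypothetical tokenizer)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Illustrative texts only; they are not from the e-book used in the experiment.
section_a = term_vector("The CPU schedules processes and the CPU reads memory")
section_b = term_vector("Memory pages are swapped to disk when memory is full")
similarity = cosine_similarity(section_a, section_b)
```

A value near 1 suggests that two sections share most of their vocabulary, while a value of 0 means they share no terms; in practice the raw counts would likely be replaced by TF-IDF weights.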

The Basic Idea
As mentioned previously, the basic idea of this study is to design an intelligent system that builds the keyword concept map for an e-book. The contents of the e-book are loaded into the proposed system. By using preprocessing methods (e.g., removing irrelevant terms) and TF-IDF, the proposed system can compute the similarities between the contents of the sections. This information is useful for recognizing the structure of the e-book. The proposed system can then use these similarities and relationships to determine which sections are relevant to one another. It can also find the most important keywords in these sections by using TF-IDF. With this information (relationships and important keywords) at hand, the proposed system can construct the keyword concept map in the manner of a knowledge integration map (http://en.wikipedia.org/wiki/Knowledge_integration_map). The reader can then easily see which parts are relevant to each other, which makes it much easier to plan a reading strategy or learning path.
As shown in Figure 1, we developed an Android app as the e-book reader, which loads the contents of the e-book. Based on the relationships and important keywords, the proposed system is able to construct the keyword concept map to improve on the learning performance of readers of a traditional e-book. After the proposed system (the Android app) finds the keywords in the contents of the e-book, it analyzes the keyword frequency for each section and selects applicable keywords according to the threshold given by a teacher or expert. In other words, keywords with too high or too low a frequency are filtered out. For example, if we set the frequency threshold range to 30 to 50, the proposed system selects the keywords in the section that appear at least 30 times but no more than 50 times. The proposed system uses the frequency of these keywords to construct the keyword concept map. To make the concept map easy to understand, each node of the concept map is annotated with keywords that explain the meaning or important concept of the node (e.g., a paragraph or section). Moreover, readers can jump to the section they want by clicking the keyword near the node.
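The keyword selection described above can be sketched as follows. This is a minimal illustration rather than the implementation of the Android app: the stop-word list, the tokenizer, the function names, and the choice to treat both threshold bounds as inclusive are all assumptions. The TF-IDF weight used here is the common form tf(t, s) × log(N / df(t)), where N is the number of sections and df(t) is the number of sections containing term t.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list standing in for "remove the irrelevant terms".
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "in", "to"}


def tokenize(text):
    """Lowercase the text, split it into words, and drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def section_counts(sections):
    """Map each section name to a Counter of its term frequencies."""
    return {name: Counter(tokenize(text)) for name, text in sections.items()}


def tf_idf(counts):
    """Compute tf * log(N / df) for every term of every section."""
    n_sections = len(counts)
    doc_freq = Counter()
    for counter in counts.values():
        doc_freq.update(counter.keys())
    scores = {}
    for name, counter in counts.items():
        total = sum(counter.values())
        scores[name] = {
            term: (freq / total) * math.log(n_sections / doc_freq[term])
            for term, freq in counter.items()
        }
    return scores


def select_keywords(counter, low, high):
    """Keep terms whose raw frequency lies in [low, high] (inclusive by assumption)."""
    return {term: freq for term, freq in counter.items() if low <= freq <= high}


# Toy sections; real sections would be far longer, with thresholds such as 30 to 50.
sections = {"1.1": "cpu cpu cpu memory", "1.2": "disk disk memory"}
counts = section_counts(sections)
weights = tf_idf(counts)
keywords = select_keywords(counts["1.1"], 2, 3)
```

Note that a term appearing in every section, such as "memory" in the toy data, receives a TF-IDF weight of zero, which is one reason very common words are poor keyword candidates.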

A Simple Example
A simple example is used to explain how to use the proposed system. In this example, the contents of an e-book on operating systems are used for document analysis, which consists of preprocessing (i.e., information extraction) and similarity computing (i.e., information retrieval). If we set the frequency threshold to between 15 and 25 times, the proposed system constructs the concept map according to this setting. As shown in Figure 2, the keywords are CPU, LRU, FIFO, Memory, and Disk, and the relationships between keywords and sections indicate their relevance. For example, if the keyword CPU appears twenty times in chapter 1.1, this information is displayed on the connecting edge (i.e., the relationship). Descriptions are also displayed near the keyword to explain where the keyword appears in the section. It is easy to see that FIFO appears in chapter 1.2; after clicking this node, the reader is moved to the page of chapter 1.2. As a result, the reader can easily and quickly find the contents they need in the e-book through the keyword concept map.
Figure 1. The design of the proposed system
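The example above can be sketched as a small data structure in which each keyword is linked to the sections where its frequency falls within the threshold, with the frequency stored on the edge. The function names, the rule that a click jumps to the section with the highest frequency, and all counts other than the twenty occurrences of CPU in chapter 1.1 are hypothetical and serve only as illustration.

```python
from collections import defaultdict


def build_concept_map(section_keyword_counts, low, high):
    """Link each keyword to (section, frequency) edges within [low, high]."""
    edges = defaultdict(list)
    for section, counts in section_keyword_counts.items():
        for keyword, freq in counts.items():
            if low <= freq <= high:
                edges[keyword].append((section, freq))
    return dict(edges)


def jump_target(concept_map, keyword):
    """Return the section a click on `keyword` would open, or None if unlinked.

    Choosing the section with the highest frequency is an assumption of this sketch.
    """
    links = concept_map.get(keyword)
    if not links:
        return None
    return max(links, key=lambda link: link[1])[0]


# CPU appearing 20 times in chapter 1.1 follows the text; other counts are made up.
counts = {
    "1.1": {"CPU": 20, "Memory": 40},
    "1.2": {"FIFO": 18, "CPU": 3},
}
concept_map = build_concept_map(counts, 15, 25)
target = jump_target(concept_map, "FIFO")
```

With the threshold set to 15 to 25, "Memory" (too frequent) and the three occurrences of "CPU" in chapter 1.2 (too rare) are left out of the map.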

Participants and Experimental Procedure
The subjects are 61 students from one class of sixth graders, divided into two groups. None of these students had learned these materials before. One group used the e-book with the keyword concept map (G1) and the other group used the e-book without the concept map (G2). The G1 group contains 31 students and serves as the experimental group, while the G2 group contains 30 students and serves as the control group. More precisely, the experimental design follows Campbell et al. (1963): the students are divided into a control group and an experimental group, and a pre-test, a post-test, and a test after review are used to assess their learning performance. As shown in Figure 3, the experiment in this paper consists of seven steps: (1) pre-test, (2) introduction to the experiment and operating procedures, including questions from students and answers to these questions, (3) use of the proposed system to construct the keyword concept map for the e-book, (4) learning by all the students, G1 and G2, with their e-books, (5) post-test, (6) review after fourteen days, and (7) the survey and the test after review.
As shown in Figure 3, in the first step, the students are divided into the G1 group as the experimental group and the G2 group as the control group, where the students in G1 use the e-book with the keyword concept map and the students in G2 use the e-book without the keyword concept map. The two groups take the test at the same time. In the second step, the teacher shows all the students in the experiment how to operate the e-book they will use for learning and answers their questions. In the third step, the proposed system analyzes the contents of each section and counts the word frequencies (i.e., the number of times each word appears in the section). The teacher uses this information to dynamically adjust the frequency range for sampling and selects applicable words as keywords from those provided by the proposed system. Finally, the proposed system uses these keywords to construct the keyword concept map of each section of the e-book. In the fourth step, the students of the G1 group use the e-book with the keyword concept map to learn, while the students of the G2 group use the other kind of e-book (i.e., the traditional e-book). The learning time is limited to 60 minutes. In the fifth step, the students of both groups are tested to evaluate their learning performance. In the sixth step, fourteen days later, the students of both groups review the material with the e-book for twenty minutes. In the final step, all the students of both groups are tested after the review procedure so that their learning performance can be evaluated. In addition, all students complete a satisfaction questionnaire.

The Tools
Four tools are used in this paper: the questions for the pre-test, the questions for G1 and G2 after they use the e-book to learn (the post-test), the questions after they use the e-book to review, and a satisfaction questionnaire designed on a 5-point Likert scale. The pre-test, the post-test, and the test after review each have twenty-five questions, and the perfect score is one hundred. Moreover, these questions were developed by an expert in this field and in e-books.

The Satisfaction Questionnaire
The satisfaction questionnaire on the concept map of the material aims to determine whether the keyword concept map helps students grasp the content structure when they see new materials for the first time, and whether it lets them achieve twice the result with half the effort when reviewing the materials. That is why we designed this questionnaire to observe the learning situation of the students. The questionnaire contains fourteen items, which can be divided into two groups: availability (items 1 to 11) and usability (items 12 to 14). A 5-point Likert scale, from 1 (strongly disagree) to 5 (strongly agree), is used to evaluate the results. The availability items are used to find out whether students consider the system useful for their learning, while the usability items are used to find out whether students can operate the proposed system easily.

Rating Items (each rated as Strongly Agree, Agree, Normal, Disagree, or Strongly Disagree)
(1) By using this system, I can have a better understanding of computer science.
(2) The system can help me to find my learning problems.
(3) The system can help me to understand knowledge of computer science I learn.
(4) Through this system, I can think more extensible subjects of computer science.
(5) Functions provided by the system are favorable for my learning.
(6) Through feedbacks of the system, I can understand more knowledge about computer science.
(7) Through feedbacks of the system, I can systematize my learning to knowledge about computer science.
(8) Through feedbacks of the system, I can focus on my learning.
(9) Feedbacks provided by the system can help me to revise the wrong idea.
(10) Through feedbacks of the system, I can have a better understanding of concepts I didn't understand completely before.
(11) Information provided by the system can help me to have a better learning performance.
(12) I can easily receive information through this system on mobile devices.
(13) The interface of system can be operated easily.
(14) I can quickly learn to operate this system.
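As a sketch of how one student's answers could be summarized, the following code averages the availability items (1 to 11) and the usability items (12 to 14) separately. The function name and the example responses are hypothetical, and the mapping from 1 (strongly disagree) to 5 (strongly agree) is assumed; the paper does not state how the reported averages were actually computed.

```python
def subscale_means(responses):
    """Average the availability (items 1-11) and usability (items 12-14) ratings.

    `responses` holds one student's fourteen ratings, each from 1 to 5.
    """
    if len(responses) != 14:
        raise ValueError("expected ratings for all fourteen items")
    if any(not 1 <= r <= 5 for r in responses):
        raise ValueError("each rating must be between 1 and 5")
    availability = responses[:11]
    usability = responses[11:]
    return sum(availability) / len(availability), sum(usability) / len(usability)


# Hypothetical answers from a single student, not data from the experiment.
example = [4] * 11 + [5] * 3
availability_mean, usability_mean = subscale_means(example)
```

Averaging these per-student values over all students in a group would give one availability score and one usability score per group.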

Results of Learning Performance
Tables 1 and 2 show the pre-test and post-test results of the control and experimental groups. The average pre-test scores indicate that the prior knowledge of the control group G2 is better than that of the experimental group G1. However, the post-test results show a different situation: the average score of the experimental group G1 is better than that of the control group G2. More precisely, the average score of the experimental group G1 is 81.68, which is better than the average score of the control group G2, 72.93, with a p-value of 0.027. These results show that using the keyword concept map improves learning performance significantly. Fourteen days after the post-test, the students of both groups used their e-books to review and then completed the test after review. The results also show that the students who used the keyword concept map for the review (G1) obtained a better average score than the students of the G2 group, who did not use it: the average score of G1 is 83.13 and that of G2 is 75.33, with a p-value of 0.014. This means that an e-book that provides a keyword concept map to support review can significantly enhance the learning performance of students. In summary, students perform better in both learning and review when the traditional e-book adopts the keyword concept map.

Results of the Satisfaction Questionnaire
Table 3 shows the results of the satisfaction questionnaire after the experiment; the user experience of all students with the proposed system is positive. For availability, the results indicate that integrating the keyword concept map into the traditional e-book enhances the availability of the e-book and makes users feel that the proposed system improves learning performance much more than the traditional e-book does. Thus, the experimental group obtained an average score of 4.0155, which is better than the average score of the control group, 3.568.
For usability, on the other hand, the average score of the experimental group G1, 4.19, is similar to that of the control group, 4.005. This suggests that the e-book is very mature today: the interface and operation of current e-books are much better than those of early e-books, and they are much friendlier than before. As a result, the difference in usability between the experimental and control groups is not significant. In summary, the keyword concept map is intended to improve the availability of the traditional e-book for students, and the results match the assumption of this study.

CONCLUSIONS AND DISCUSSIONS
This study proposed using a keyword concept map to improve learning performance with the traditional e-book. The experimental subjects were sixth-grade students, and paper-and-pen tests were used to measure the performance of the proposed system for learning and review. These results echo Coyle (2008), who argued that the real value of the e-book depends on finding applications that have not yet received close attention. That is why the relevant issues of the e-book still attract the attention of researchers from different disciplines. It is also why, even though the traditional paper book and the traditional e-book have matured, innovative ideas can still improve the e-book and keep it a popular research domain. For these reasons, this study provides a novel system that creates the keyword concept map to strengthen the traditional e-book and make it easier to understand. The results on learning performance show that both learning and review can be significantly improved by using the proposed system. The findings indicate that the students who used the e-book with the keyword concept map (the experimental group) obtained better average test scores than the students who used the traditional e-book without the keyword concept map (the control group). Moreover, the results of the satisfaction questionnaire show that the experimental group scored better than the control group on availability. An interesting result is that the usability results of the two groups are very similar. According to our observation, this is because the e-book today is mature; thus, even with the traditional e-book, the user experience is not poor. It means that considering only the interface of the e-book or its usage behavior would not produce a significant difference in how users feel. Nevertheless, looking only at the experimental results, the usability score of the experimental group is still higher than that of the control group, which suggests that there is still room to improve usability in line with our expectations. In future work, we will try to improve the flexibility of the proposed system, i.e., allow the user to choose multiple sections when creating the concept map rather than only one section at a time. We will also try to find or develop better methods for choosing the keywords accurately and automatically, to reduce the load on the teacher, so that the proposed system can create the concept map fully automatically over the whole process.

Figure 2. A simple example of the proposed system

Table 1. The pre-test results of the control and experimental groups

Table 2. The post-test results of the control and experimental groups

Table 3. The results of the satisfaction questionnaire