Research on Online Learning Behavior Analysis Model in Big Data Environment

Building on a review of the current state of online learning behavior and related theoretical research, and taking into account the problems in existing work, this paper makes the following contributions: (1) It defines the connotation of online learning behavior and introduces the theory of artificial intelligence to classify online learning behavior along the structural, functional and mode dimensions. (2) Following the overall architecture of the analysis model, it constructs the model from left to right and from top to bottom in a big data environment. The online learning behavior data model is built from a multi-dimensional, multi-level perspective to determine the source, method and process of data collection. The horizontal and longitudinal processes of the analysis model are then designed, and big data processing technology is used to implement the specific algorithms involved in each part of the model. (3) The online learning platform of China Business Executives Academy, Dalian is chosen for empirical analysis, which covers three aspects: learning behavior clustering based on the K-means algorithm, individualized course recommendation based on the PageRank algorithm, and correlation analysis of learning effects.


INTRODUCTION
The rapid development of Internet technology and educational information technology has accelerated people's learning and changed their ways of thinking and cognition. Online learning has risen quickly and is widely recognized. This new mode of learning and education will drive reform and innovation in educational informatization. Its rapid development also brings challenges: low course completion rates and user attrition occur frequently on online learning platforms. To find the reasons, this paper analyzes, in the context of big data, the large volume of learning behavior data that users leave on an online learning platform. By tracking the behaviors users generate during learning and analyzing the results, we can give teachers and platform managers guidance for monitoring learners and intervening in their learning. Therefore, on the basis of a review of the current state of online learning behavior and related theoretical research, this paper covers the following aspects. Firstly, it defines the connotation of online learning behavior and introduces the theory of artificial intelligence to classify online learning behavior along the structural, functional and mode dimensions. On the basis of the associated factors and driving forces, it provides the overall architecture of the online learning behavior analysis model in a big data environment.
Secondly, following the overall architecture, the analysis model is constructed from left to right and from top to bottom in the big data environment. The online learning behavior data model is built from a multi-dimensional, multi-level perspective to determine the source, method and process of data collection. The horizontal and longitudinal processes of the analysis model are then designed, and big data processing technology is used to implement the specific algorithms involved in each part of the model.
Thirdly, the online learning platform of China Business Executives Academy, Dalian (CBEAD) is chosen for empirical analysis, which covers three aspects: learning behavior clustering based on the K-means algorithm, individualized course recommendation based on the PageRank algorithm, and correlation analysis of learning effects. The application effects and implications are then drawn from the analysis results.

State of the literature
• Research on online learning platforms is relatively mature, but the analysis of online learning behavior in big data environments is still in its infancy.
• Theoretical and practical research has made some progress, but shortcomings remain. Data acquisition for platform users' online learning behavior is poorly targeted: most studies collect data directly from the database, so extracting the valuable data items is important groundwork for learning behavior analysis. In addition, the analysis methods available to online learning behavior analysis models are limited.
Contribution of this paper to the literature
• The concept and connotation of online learning behavior are defined, and its categories are classified according to the theory of artificial intelligence. The paper studies the factors associated with online learning behavior and the driving forces of behavior analysis in the big data environment, sets out the construction principles and theoretical basis of the behavior analysis model, and on that basis designs the overall architecture of the analysis model for online learning behavior.
• Following the overall architecture of the analysis model, the paper first constructs the data model of online learning behavior from the multi-dimensional and multi-level perspectives, so that online learning behavior is stored and recorded from multiple viewpoints and the subsequent data acquisition and analysis are guided. It then designs the vertical and horizontal processes of the analysis model.
• The analysis methods corresponding to the behavior analysis model are worked out. For the clustering analysis of online learning behavior, the classification indices are identified first and the clustering is then carried out. For personalized course recommendation, the cluster of the target user is determined, a representative user is found within it, and the course that user rates highest is recommended to the target learner. For the correlation between learning behavior and learning effect, the learning behavior index attributes are first defined and reduced, and the improved Apriori algorithm is then used to extract decision rules.
• Taking the online learning platform of China Business Executives Academy, Dalian (CBEAD) (E class) as the research object, the learning behavior analysis model is applied to the corresponding framework, the results are fed back to learners, teachers, education researchers and platform managers, and the application effects and implications are given.

LITERATURE REVIEW
Given the characteristics of online learning behavior in the big data environment, many scholars have explored big data development, online learning platforms and online learning behavior, and have obtained rich theoretical and practical results. At present, the study of online learning behavior in the big data environment focuses mainly on three aspects, as Figure 1 shows.
(1) Research on big data. Although the concept of big data was proposed early, the development of its technology is still in its infancy. So far, the main areas of big data technology include visual analysis, data mining algorithms, semantic engines, data quality and data management. Google's MapReduce model, for example, targets the processing of large data sets. In education, Purdue University uses big data technology to build an early warning mechanism for learning by collecting data from students in a course. Current research on big data focuses mainly on extracting value from data sets: to discover knowledge in the data and use it to guide decisions, the data must be analyzed in depth. Social media big data is a particularly active area of big data analysis.
(2) Research on online learning platforms. An online learning platform, also called a network learning platform or network teaching platform and known abroad as an e-Learning platform, supports students in studying online courses by providing an open, shared teaching environment. According to their target users, online learning platforms fall into two categories: customized for-profit platforms and non-profit platforms that provide free course tutoring. Moodle, for example, is the world's most widely used free online learning platform; it not only gathers excellent courses from well-known universities worldwide but also offers a variety of online communication tools for all kinds of learners. In recent years, the typical representative of online learning platforms has been the massive open online course, or MOOC.
(3) Research on online learning behavior. Much research touches on this behavior, but the research objects and sample selection have certain limitations, and some studies of online learning behavior were conducted under the intervention of the researchers. Kenneth studied the influence of learning behavior and reflective learning in online business courses. Prior analyzed the influences on online learning behavior from three aspects: learning attitude, information literacy and self-efficacy. Butcher studied the relationship between different levels of prior knowledge and online learning behavior; the results show that a higher level of prior knowledge can lead learners to deeper learning behavior.

RESEARCH DESIGN
After determining the goals of online learning behavior analysis, and taking behavioral science, systems theory and learning theory as the theoretical basis, we build the data model and the data collection scheme for online learning behavior and then construct the analysis model from left to right and from top to bottom. Following the analysis model and the online learning process, the analysis is divided into three parts: cluster analysis, recommendation analysis and correlation analysis. Following the process of problem solving, the longitudinal process of the online learning behavior analysis model is divided into data processing, method selection and analysis, result output and so on, as Figure 2 shows.
(1) Analysis of the factors associated with online learning behavior. Online learning behavior is a multidimensional complex system. A detailed analysis of the elements that influence it identifies learners' characteristics, motivation and style, helps teachers and managers design reasonable teaching structures and strategies, and guides the later clustering of learning behavior, the recommendation of learning resources and the study of the correlation between behavior and learning effect. The analysis considers not only the internal factors of the learner but also external factors such as the learning environment, the support system and the teaching mode. The internal factors include the learner's information literacy, learning motivation, prior subject knowledge, learning style and self-efficacy. The external factors include online teaching patterns, online learning resources, online learning support systems, and teachers' teaching skills.
(2) Driving forces of online learning behavior analysis. These include demand and interest, technology, and data. Demand and interest come from online learners, course organizers, platform managers and education researchers. The technical driving forces include educational data mining, learning analytics, and so on. The data driving forces include data volume, data collection, and data storage.
(3) Classification of online learning behavior. Learning behavior is mainly divided into explicit and implicit behavior. Most of what the platform can record is explicit learning behavior, such as browsing, searching and saving. Behavior related to mental activities such as learning motivation, reflection and memory is implicit learning behavior. This article mainly analyzes the explicit learning behavior that an online learning platform can record in the big data environment. Because online learning behavior is a complex system, it can be classified along multiple dimensions, forming a complete and comprehensive classification system.

EVALUATED MEASUREMENTS
(1) Online learning behavior cluster analysis based on the K-means method. First, we identify the classification indices for online learning behavior, RFL. The RFM analysis method distinguishes customers by three behavioral variables: Recency, Frequency, and Monetary. Building on the RFM method, a classification index system for online learners, RFL (learning recency, learning frequency and learning length), is proposed for classifying the users of an online learning platform. The index variables of the RFM method were redesigned by analyzing the online learning process and the support service environment the platform provides. The RFL classification index system built in this article is shown in Table 1.
Given this classification index system, the K-means algorithm is used to cluster the behavior. The analysis comprises three steps: data normalization, determination of the learner behavior feature weights, and K-means clustering. A current focus of online education research is identifying which categories of learners on a platform show high course loyalty and which show low course loyalty. How learners study an online course gives an intuitive and clear picture of their learning attitude. Therefore, once the classification indices for online learning behavior are built, the weight of each index in each category is calculated and a course loyalty score is computed for the learners, helping teachers and platform managers adjust teaching strategies and teaching modes in time and raise the popularity of online learning. The formula for loyalty based on online learning behavior is:

$$\text{Loyalty} = w_R R' + w_F F' + w_L L'$$

Here, $w_R$ stands for the weight of R, $w_F$ for the weight of F and $w_L$ for the weight of L; $R'$ stands for the mean value of R, $F'$ for the mean value of F and $L'$ for the mean value of L.
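The three steps above (normalization, weighting, K-means) and the loyalty score can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure run in SPSS for this study: the function names are hypothetical, every index is assumed to be oriented so that larger values are better, the weights are applied by scaling the normalized features before clustering, and a deterministic farthest-first initialization replaces whatever initialization the statistics package uses.

```python
import math

def min_max_normalize(rows):
    """Scale each column to [0, 1]; a constant column maps to 0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def farthest_first_init(points, k):
    """Deterministic seeding (an assumption): start at the first point,
    then repeatedly add the point farthest from the chosen centroids."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids

def k_means(points, k, max_iter=100):
    """Plain Lloyd iteration; returns one cluster label per point."""
    centroids = farthest_first_init(points, k)
    labels = []
    for _ in range(max_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            updated.append([sum(col) / len(members) for col in zip(*members)]
                           if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels

def cluster_loyalty(rows, weights, k):
    """Cluster learners on weighted, normalized (R, F, L) rows and score
    each cluster as the weighted sum of its mean normalized indices."""
    norm = min_max_normalize(rows)
    weighted = [[w * v for w, v in zip(weights, row)] for row in norm]
    labels = k_means(weighted, k)
    scores = {}
    for c in set(labels):
        members = [row for row, lab in zip(norm, labels) if lab == c]
        means = [sum(col) / len(members) for col in zip(*members)]
        scores[c] = sum(w * m for w, m in zip(weights, means))
    return labels, scores
```

Calling `cluster_loyalty(rows, (w_R, w_F, w_L), k)` on a list of (R, F, L) tuples returns the cluster label of each learner and one loyalty score per cluster, so clusters can be ranked from most to least loyal.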
(2) Analysis of personalized course recommendation based on online learning behavior. On an online learning platform, personalized course recommendation first uses the clustering result for online learning behavior, which divides learners into clusters with different behavior characteristics, to identify the cluster the target learner belongs to. Second, the PageRank algorithm is used to identify one or more representative learners. Finally, the most highly rated course resource is recommended to the target user. When a user learns on a platform, there is a lot of interaction with courses. Scoring a learner's behaviors makes clear how far the learner has engaged with a course. The scores for specific learning behaviors are: 0 - browse, 1 - collect, 2 - register, 3 - watch, 4 - evaluate, 5 - test, 6 - take an examination, 7 - share, 8 - download. Table 2 presents part of the learners' ratings of the courses. The course on which a learner's behavior score is highest is the one that learner is most likely to study.
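The behavior scoring can be illustrated with a short Python sketch. The event format, the function names and the rule that a learner's rating of a course is the highest score among the behaviors observed for it are assumptions made for illustration; only the 0 to 8 score table comes from the text.

```python
# Score table from the text: higher scores mean deeper engagement.
BEHAVIOR_SCORES = {
    "browse": 0, "collect": 1, "register": 2, "watch": 3, "evaluate": 4,
    "test": 5, "examination": 6, "share": 7, "download": 8,
}

def course_ratings(events):
    """Build learner -> {course: rating} from (learner, course, behavior)
    events. Assumption: the rating is the highest behavior score seen."""
    ratings = {}
    for learner, course, behavior in events:
        score = BEHAVIOR_SCORES[behavior]
        per_learner = ratings.setdefault(learner, {})
        per_learner[course] = max(score, per_learner.get(course, 0))
    return ratings

def top_course(ratings, learner):
    """Return the course on which the learner has the highest rating."""
    courses = ratings[learner]
    return max(courses, key=courses.get)
```

With a log such as `[("u1", "leadership", "watch"), ("u1", "leadership", "share"), ("u1", "finance", "browse")]`, learner `u1` rates `leadership` at 7 and `finance` at 0, so `leadership` is the course `u1` is most likely to study.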
The PageRank algorithm is combined with the MapReduce framework to extract typical learners. Every node $u_i$ in the PageRank algorithm is initially given a PageRank value, denoted $PR(u_i)$. An edge between two nodes represents a vote between them, and $L(u_i)$ denotes the number of edges sent from $u_i$. The number of votes a node casts equals the number of nodes it can reach, so the contribution of node $u_i$ to node $u_j$ is $PR(u_i)/L(u_i)$.

Similarity is computed as the cosine similarity between users' behavior feature values. The behavior feature variables of a platform user form a vector $u = (R, F, L)$. The behavior feature vector of the target learner, $u_i = (R_i, F_i, L_i)$, is compared with that of another learner in the group, $u_j = (R_j, F_j, L_j)$, by cosine similarity:

$$sim(u_i, u_j) = \frac{R_i R_j + F_i F_j + L_i L_j}{\sqrt{R_i^2 + F_i^2 + L_i^2}\,\sqrt{R_j^2 + F_j^2 + L_j^2}}$$

The larger $sim(u_i, u_j)$ is, the better the two learners match and the more representative one is of the other. This analysis yields the similarity matrix $M$. From the similarities between each learner and the target, the representative set for the target is found. If $sim(u_i, u_j) \ge \varepsilon$ in $M$, learner $u_j$ can represent learner $u_i$, written $r(u_i, u_j) = 1$; if $sim(u_i, u_j) < \varepsilon$, there is no representation between learner $u_i$ and learner $u_j$, written $r(u_i, u_j) = 0$. In this way the learner similarity matrix $M$ gives the representation matrix $P$. The set of representative learners calculated from $M$ and $\varepsilon$ is $S_i = \{u_j \mid sim(u_i, u_j) \ge \varepsilon,\ u_j \in D\}$. Initially, the PageRank value of each element of the learner data set $D$ is 1. If a learner $u_j$ is in the representative set of the target learner $u_i$, then $u_i$ can vote for $u_j$, and the number of times $u_j$ appears in representative sets is counted; each such vote has the value 1. The votes a learner receives in every round are summed to give that learner's total, and the learner with the highest total is the representative learner whose course resources are recommended to the target learner.

The pseudo-code for extracting representative information with PageRank is as follows. The input is the learner data set $D = (u_1, u_2, \ldots, u_n)$, the representation coefficient $\varepsilon$ and a parameter in $[0, 1]$; the output is obtained through the following steps:
S() = ∅: declare the representative set of learners.
M[] = compute_Similarity(D): calculate the similarities to get the matrix M.
P[][] = Compute_represent(M): derive the representation matrix from M with the given representation coefficient.
S[] = Compute_represent_sets(P): obtain the collection of representative learner sets.
V[] = Compute_vote_value(S, D): calculate the number of votes per representative set.
Max_vote_value = Find_max(V): find the maximum number of votes.

Because online learning behavior data have big data characteristics (large volume and great variety), using the plain PageRank algorithm alone to extract typical learner information makes the algorithm slow and the response time long. Therefore, in the big data context, MapReduce, a core framework of cloud computing, is introduced to extract representative learners more efficiently. The map function builds 〈key, value〉 pairs, breaks the data into separate chunks, and calculates the counts and the number of votes for each index set. The reduce function then combines the pairs that share the same key to total the votes. The PageRank algorithm is implemented with the existing MapReduce parallel processing technology, using the open-source code on Hadoop.
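The similarity, representation and voting steps can be sketched in plain Python, with the map and reduce phases written as ordinary functions that run in a single process rather than on Hadoop. The function names, the choice to exclude a learner from its own representative set, and the rule that every appearance in a representative set counts as one vote are assumptions made for illustration.

```python
import math
from collections import defaultdict

def cosine_similarity(a, b):
    """Cosine of the angle between two behavior vectors (R, F, L)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def representative_sets(vectors, threshold):
    """For each learner i, collect the learners j != i whose similarity
    to i reaches the representation coefficient (threshold)."""
    return {
        i: {j for j, vj in vectors.items()
            if j != i and cosine_similarity(vi, vj) >= threshold}
        for i, vi in vectors.items()
    }

def map_votes(target, reps):
    """Map phase: emit one (representative, 1) pair per member of the
    target's representative set. The target itself is not emitted."""
    for rep in sorted(reps):
        yield rep, 1

def reduce_votes(pairs):
    """Reduce phase: sum the votes that share the same key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

def most_representative(vectors, threshold):
    """Return the learner with the most votes, or None if nobody is voted for."""
    reps = representative_sets(vectors, threshold)
    votes = reduce_votes(
        pair for target, members in reps.items()
        for pair in map_votes(target, members)
    )
    return max(votes, key=votes.get) if votes else None
```

In a real MapReduce job the `map_votes` calls would run in parallel over chunks of the learner data and the framework would group the emitted pairs by key before `reduce_votes` runs; the single-process version only shows the data flow.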
(3) Association rule analysis of online learning behavior and learning effects. An online learner's learning effect is affected by many learning behavior factors. Exploring these factors in depth requires mining and analyzing a large amount of data that reflects the whole learning process, and analyzing the relationship between behavior and effect requires breaking the learning process down. In the current study, learning behavior mostly follows the basic flow of "registration, in-class tests, homework, listening to lectures, class discussion, examination, certificate"; the specific online learning process is shown in Figure 3. In online learning, the learning behavior itself is ambiguous, and the relationships between behavior objects are not crisp: a relationship is only general or special. Handling the attributes of such objects requires fuzzy rough sets. The fuzzy rough set, however, resembles the classical rough set model in being susceptible to noisy data. Scholars have therefore established the variable precision fuzzy rough set (VPFRS) model to solve these problems. The improved model can handle the data objects of a fuzzy system and greatly improves resistance to interference, but it is sensitive to changes in its parameter values. In this paper, with the aid of the fuzzy implication operator and classical set inclusion, a fuzzy Bayesian rough set is introduced for the attribute reduction of online learning behavior, and the improved Apriori algorithm then extracts the association rules between learning behavior and learning effect.
(4) Empirical research on the online learning platform of China Business Executives Academy, Dalian (CBEAD). Following the data collection described earlier, the data were extracted from the leadership course and cover the period from the start of the course to May 9, 2017. To keep the experimental results objective and valid, 700 learners were randomly selected. Each learner's learning is measured by three indices: learning recency, learning frequency and learning length. Partial statistics of the corresponding data are shown in Table 3.
Following the steps of the K-means clustering algorithm described above, the index data were normalized, and the entropy method, an objective weighting method, was used to determine the index weights. By calculation, the learning recency weight was 0.225, the learning frequency weight was 0.564, and the learning length weight was 0.211. The number of learner categories was then set to 8, and the K-means clustering was run in the statistics software SPSS, yielding eight types of learners for the course; the categories are shown in Table 4. The results show that the index averages differ greatly across learner categories. To classify learners accurately, the index averages of each category were compared with the overall averages. The experiment finally produced seven levels of online learners, defined as: important development, general important, general learners, important to keep, important to retain, worthless, and watchers. An Excel pie chart visualizes the number of learners at each level, as shown in Figure 4.
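The entropy method mentioned above can be sketched as follows. This is the standard textbook form of the entropy weight method, written with hypothetical function names; whether the study used exactly this normalization is an assumption. Each index column is min-max scaled, turned into proportions, and given an entropy; an index whose values are spread unevenly across learners has lower entropy and therefore receives a larger weight.

```python
import math

def entropy_weights(rows):
    """Return one weight per column of `rows` using the entropy method.
    A constant column carries no information and gets weight 0."""
    n = len(rows)
    cols = list(zip(*rows))
    divergences = []
    for col in cols:
        lo, hi = min(col), max(col)
        # Min-max scaling to [0, 1].
        scaled = [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in col]
        total = sum(scaled)
        if total == 0 or n < 2:
            entropy = 1.0  # no variation: maximum entropy
        else:
            probs = [v / total for v in scaled]
            # Shannon entropy normalized by ln(n), with 0 * ln(0) taken as 0.
            entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        # Degree of divergence; clamp tiny negative rounding errors.
        divergences.append(max(0.0, 1.0 - entropy))
    spread = sum(divergences)
    if spread == 0:
        # Every column is uninformative: fall back to equal weights.
        return [1.0 / len(cols)] * len(cols)
    return [d / spread for d in divergences]
```

The returned weights sum to 1, which matches the form of the reported weights (0.225 + 0.564 + 0.211 = 1), and they can be passed directly to a weighted clustering or loyalty calculation.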

CONCLUSION AND RECOMMENDATION
Based on an analysis of the main problems of online learning in the big data environment, and drawing on behavioral science and artificial intelligence theory, this paper defined the concept and classification of online learning behavior, analyzed the factors associated with online learning behavior in the big data environment, built the online learning behavior analysis model, and designed the longitudinal analysis process covering behavioral clustering, personalized recommendation and association rule mining. Finally, the China Business Executives Academy, Dalian (CBEAD) online learning platform was chosen for empirical research on online learning behavior in the big data environment, and the application effects and implications of the analysis were given. The results of this study are as follows. Firstly, the evolution of artificial intelligence theory and the relations among its three dimensions were found to satisfy the choice of dimensions needed for classifying online learning behavior. The structural, functional and mode dimensions reveal the classification of behavior and the analysis process, which includes cluster analysis, recommendation analysis and correlation analysis, thus forming a complete and comprehensive system of behavior analysis. Secondly, following the learning process on the online learning platform, the RFM analysis method from the commercial field was used to establish the RFL behavior classification index system, which describes behavior characteristics by the recency, frequency and length of users' learning on the platform, completing a classification index system based on user behavior. Thirdly, from the angle of rough set theory, online learning behavior analysis in the big data environment was treated as a fuzzy decision object, and the fuzzy Bayesian rough set was introduced for the attribute reduction of online learning behavior. Then, using the improved Apriori algorithm, the association rules between learning behavior and learning effect were extracted. Finally, through the concrete methods and process of online learning behavior analysis (cluster analysis, recommendation analysis and correlation analysis under big data conditions), related suggestions and implications are provided from four angles: learning assessment and intervention for platform users, teaching decisions and teaching improvement for teachers, research for education researchers, and platform monitoring and management for platform managers.