Resolving Java server performance issues

  Improving the performance of Java server applications through load testing and analysis

  Author: Ivan Small 

  Translator: xMatrix 

  Disclaimer: Any site authorized by Matrix may reprint this article, provided the reprint identifies, in hyperlink form, the original source, the author information, and this statement. Author: Ivan Small; xMatrix
  Original Address: http://www.javaworld.com/javaworld/jw-02-2005/jw-0207-server-p3.html 
  Chinese Address: http://www.matrix.org.cn/resource/article/43/43998_server_capacity.html 
  Key words: server capacity 

  Abstract 

  Improving the performance of a Java server requires simulating load on it. Building the simulation environment, collecting the data, and analyzing the results can present many challenges to a development team. This article uses an example to introduce the concepts and tools of Java server performance analysis. The author uses the example to study how excessive per-request memory use and synchronization contention affect performance.

  Project teams are generally good at organizing a set of specific tasks and completing them. Simple performance problems are easily isolated and fixed by a single developer. Larger performance problems, which usually appear only when the system is under high load, are not so simple to handle. They require an independent test environment, a simulated load, and careful analysis and tracking.

  In this article, I use fairly common tools and equipment to build a test environment. I focus on two performance problems, memory and synchronization, that are hard to find through simple analysis. By working through a concrete example, I hope to make complex performance problems easier to solve and to show some of the details involved in dealing with them.

  Improving server performance

  Improving server performance depends on data. Changing an application or its environment without reliable data can make the results even worse. A profiler provides useful information about a Java server application, but data gathered under a single-user load is completely different from data gathered under a multi-user load, so profiler data can be misleading. Using a profiler to optimize an application during development is good practice, but analyzing the application under high load yields better results.

  Analyzing the performance of a server application under load requires a few basic elements:
  1. A controlled environment in which to load test the application.
  2. A controlled artificial load that drives the application to full capacity.
  3. Data collected from monitors, from the application, and from the load testing tool itself.
  4. Tracking of performance changes.

  Do not underestimate the importance of the last requirement, performance tracking: if you cannot track performance, you cannot actually manage the project. A 10 to 20 percent performance improvement makes no noticeable difference to a single user, but for support staff it is a different story. A 20 percent improvement is significant, and by tracking performance improvements you can provide important feedback and continuity.
  Although performance tracking is important, earlier test results sometimes have to be discarded so that later tests are more accurate. During performance testing, the load simulation may need to be modified to improve its accuracy. When such a change is necessary, running the load test before and after the change lets you observe its effect.
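
  One way to keep such a record is to store one headline figure per test run and report the change against a baseline. The sketch below is only an illustration: the class name RunHistory, its methods, and the choice of requests per second as the tracked figure are assumptions, not part of the test setup described in this article.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: keeps one throughput figure per load-test run. */
final class RunHistory {
    private final List<String> labels = new ArrayList<>();
    private final List<Double> requestsPerSecond = new ArrayList<>();

    /** Records the result of one run; the first recorded run is the baseline. */
    void record(String label, double rps) {
        if (rps <= 0) {
            throw new IllegalArgumentException("throughput must be positive");
        }
        labels.add(label);
        requestsPerSecond.add(rps);
    }

    /** Percentage change of run {@code index} relative to the baseline run. */
    double percentChangeFromBaseline(int index) {
        double baseline = requestsPerSecond.get(0);
        return (requestsPerSecond.get(index) - baseline) / baseline * 100.0;
    }

    /** Discards all runs, for when the test itself changed and old results no longer compare. */
    void resetBaseline() {
        labels.clear();
        requestsPerSecond.clear();
    }

    String label(int index) { return labels.get(index); }

    int size() { return labels.size(); }
}
```

  Calling resetBaseline() corresponds to the case above in which the load simulation was modified and the earlier results had to be abandoned.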

  A controlled environment needs at least two independent machines plus a third, controlling machine. One machine generates the load, the control machine sets up the tests and receives feedback, and the third machine runs the application. In addition, the network between the load machine and the application machine should be separate from the LAN. The control machine receives feedback from the machine running the application, such as operating system statistics, hardware utilization, and application (especially VM) state.

  The most accurate load simulations are usually based on actual user data and web server access logs. If you have not deployed yet, or lack actual user data, you can construct similar scenarios, ask the sales and product management teams, or make educated guesses. Reconciling the load test with the actual user experience is an ongoing process.

  Simulating some user scenarios is essential. In a typical address book application, for example, you should distinguish between update and query operations. My test application, GrinderServlet, has only one scenario: a single user connects to the servlet 10 times in succession, with a pause between visits. Although the application is small, I think it reproduces some of the most common situations. Users do not normally send requests to a server without pausing. Without the pauses, we could not get an accurate estimate of the maximum number of real users the server can support.
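
  The scenario above (a fixed number of serial requests with a pause between them) can be sketched in a few lines. This is not the OpenSTA script used for the tests; VirtualUser and its parameters are hypothetical names, and the request is passed in as a Runnable so that the sketch does not depend on any particular HTTP client.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical virtual user: issues a fixed number of requests with a pause between them. */
final class VirtualUser {
    private final Runnable request;      // one request, e.g. an HTTP GET against the servlet
    private final int requestCount;      // 10 in the scenario described above
    private final long thinkTimeMillis;  // pause between two requests

    VirtualUser(Runnable request, int requestCount, long thinkTimeMillis) {
        this.request = request;
        this.requestCount = requestCount;
        this.thinkTimeMillis = thinkTimeMillis;
    }

    /** Runs the scenario serially and returns the elapsed time of each request in milliseconds. */
    List<Long> runScenario() {
        List<Long> elapsed = new ArrayList<>();
        for (int i = 0; i < requestCount; i++) {
            long start = System.nanoTime();
            request.run();
            elapsed.add((System.nanoTime() - start) / 1_000_000);
            if (i < requestCount - 1) {
                try {
                    Thread.sleep(thinkTimeMillis); // think time is not counted as response time
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break; // stop the scenario early if the test is being shut down
                }
            }
        }
        return elapsed;
    }
}
```

  A load generator would start many such users on separate threads; the think time is what keeps the number of simulated users comparable to a number of real users.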

  Another reason for issuing 10 requests in series is that a real application rarely consists of a single HTTP request. Issuing isolated single requests would change many factors in the environment. In Tomcat, a request creates a session, and the HTTP protocol allows a connection to be reused across requests. I adjusted the load test so that these effects do not distort the results.

  GrinderServlet performs no data operations of any kind, although most applications require them. In such applications, you need to create simulated data sets and use them to build the use cases for the load test.

  For example, if a use case involves a user of a web application, selecting users at random from a list of candidate users is more accurate than always using a single user. Otherwise, you might inadvertently exercise a system cache, some other optimization, or some subtle effect, and that would make the results incorrect.
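
  A minimal sketch of that idea follows, assuming the simulated users are available as a list of names. UserPicker is a hypothetical class, and the fixed seed is a design choice made here so that two test runs draw the same sequence of users.

```java
import java.util.List;
import java.util.Random;

/** Hypothetical test-data picker: chooses a simulated user at random for each scenario run. */
final class UserPicker {
    private final List<String> users;
    private final Random random;

    UserPicker(List<String> users, long seed) {
        if (users.isEmpty()) {
            throw new IllegalArgumentException("need at least one user");
        }
        this.users = users;
        this.random = new Random(seed); // a fixed seed keeps test runs repeatable
    }

    /** Returns a uniformly chosen user from the pool. */
    String next() {
        return users.get(random.nextInt(users.size()));
    }
}
```

  Drawing from a pool that is larger than any cache in the system is what prevents the test from measuring only the cached path.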

  Load testing software lets you construct test scenarios and apply them to a service as load. I use the OpenSTA testing software in the examples below. It is easy to learn, its results export easily, it supports parameterized scripts, and it can monitor information as it changes. Its main shortcoming is that it is Windows-based, but that is not a problem here. Many alternatives are available, such as Apache JMeter and Mercury LoadRunner.

  The GrinderServlet 

  Listing 1 shows the GrinderServlet class, and Listing 2 shows the Grinder class it uses.
  Listing 1 

  package pub.capart;

  import java.io.*;
  import java.util.*;
  import javax.servlet.*;
  import javax.servlet.http.*;

  public class GrindServlet extends HttpServlet {
      protected void doGet(HttpServletRequest req, HttpServletResponse res)
              throws ServletException, IOException {
          Grinderv1 grinder = Grinderv1.getGrinder();
          // Time a single call to the CPU- and memory-intensive method.
          long t1 = System.currentTimeMillis();
          grinder.grindCPU(13);
          long t2 = System.currentTimeMillis();

          PrintWriter pw = res.getWriter();
          pw.print("<html>\n<body>\n");
          pw.print("Grind Time = " + (t2 - t1));
          pw.print("</body>\n</html>\n");
      }
  }

  Listing 2 

  package pub.capart;

  /**
   * This is a simple class designed to simulate an application consuming
   * CPU, memory, and contending for a synchronization lock.
   */
  public class Grinderv1 {
      private static Grinderv1 singleton = new Grinderv1();
      private static final String randstr =
          "This is just a random string that I'm going to add up many many times";

      public static Grinderv1 getGrinder() {
          return singleton;
      }

      public synchronized void grindCPU(int level) {
          StringBuffer sb = new StringBuffer();
          String s = randstr;
          // Loop bound assumed to be level; the condition was lost in the source text.
          for (int i = 0; i < level; i++) {
              sb.append(s);
              // Every pass allocates new temporary strings and buffers.
              s = getReverse(sb.toString());
          }
      }

      public String getReverse(String s) {
          StringBuffer sb = new StringBuffer(s);
          sb = sb.reverse();
          return sb.toString();
      }
  }

  The classes are very simple, but they exhibit two very common problems. At first glance the bottleneck appears to be the synchronized modifier on the grindCPU() method, but in reality memory consumption is the real crux of the problem. Figure 1 shows my first load test, which uses a typical load ramp. The ramp matters here because the test is simulating a high load. This warm-up approach is also more accurate because it avoids the problems caused by JSP compilation. I usually run a single-user simulation before a load test.

  Figure 1 

  I use the same summary chart layout throughout this article. Much more information is available when running a load test, but only the useful part is shown here. The top panel shows the number of completed requests and the request time. The second panel shows active users and the failure rate; I count timeouts, incorrect server responses, and requests that take longer than five seconds as failures. The third panel shows JVM memory statistics and CPU usage. The CPU figure is user time averaged across all processors; all test machines here have dual CPUs. The memory statistics include the garbage collection rate and the amount of memory reclaimed per second.
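
  The failure rule used in the second panel can be written down as a small predicate. The sketch below is an illustration only; RequestOutcome and its parameter names are hypothetical, and the five-second threshold is the one stated above.

```java
/** Hypothetical result classifier using the failure rule described for the second panel. */
final class RequestOutcome {
    /** Requests that take longer than five seconds count as failures. */
    static final long MAX_ACCEPTABLE_MILLIS = 5_000;

    /**
     * @param timedOut      the client gave up waiting for a response
     * @param responseValid the response matched what the test script expected
     * @param elapsedMillis the measured response time
     */
    static boolean isFailure(boolean timedOut, boolean responseValid, long elapsedMillis) {
        return timedOut || !responseValid || elapsedMillis > MAX_ACCEPTABLE_MILLIS;
    }

    /** Failure rate in percent for a batch of requests; 0 if the batch is empty. */
    static double failureRatePercent(int failures, int total) {
        return total == 0 ? 0.0 : 100.0 * failures / total;
    }
}
```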

  The two most obvious features of Figure 1 are the 50 percent CPU usage and the large amount of memory being allocated and freed. The reasons can be seen in Listing 2. The synchronized modifier serializes all processing, so the application behaves as if it had only one CPU, and the algorithm consumes a large amount of memory in local variables.

  CPU is the limiting resource. If this test could take full advantage of both CPUs, performance could double. The garbage collector runs so frequently that it cannot be ignored. About 100 MB of memory is freed every second of the test, which is clearly a limiting factor. The failure rate makes it obvious that the application is unusable in this state.
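
  To get a feel for why Listing 2 produces so much garbage, the growth of its buffer can be estimated with simple arithmetic. The sketch below assumes that the loop in grindCPU() runs level times (the loop condition is not legible in the listing) and counts only the three full-length copies made per pass: sb.toString(), new StringBuffer(s), and the second toString(). AllocationEstimate is a hypothetical name, and the result is a rough lower bound, not a measurement.

```java
/** Rough allocation estimate for a Listing 2 style loop; the loop bound i < level is assumed. */
final class AllocationEstimate {
    /** Length of the growing buffer after {@code level} passes of append-then-reverse. */
    static long finalLength(int seedLength, int level) {
        long s = seedLength; // length of the string appended on the next pass
        long sb = 0;         // length of the growing buffer
        for (int i = 0; i < level; i++) {
            sb += s;         // sb.append(s)
            s = sb;          // the reversed copy of sb has the same length as sb
        }
        return sb;
    }

    /** Characters copied into temporary objects over all passes (three copies per pass). */
    static long temporaryChars(int seedLength, int level) {
        long s = seedLength;
        long sb = 0;
        long total = 0;
        for (int i = 0; i < level; i++) {
            sb += s;
            total += 3 * sb;
            s = sb;
        }
        return total;
    }
}
```

  Because the buffer doubles on every pass, the last few passes dominate. Under these assumptions, the 69-character string and a level of 13 give about 1.7 million temporary characters for each call to grindCPU().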

  Monitoring

  Once you can generate a reasonable user load, you need monitoring tools to collect data on what the system is doing. In my test environment, I can gather all of the following useful information:

  1. Utilization of all computers, network equipment, and so on.
  2. JVM statistics.
  3. Time spent in individual Java methods.
  4. Database performance information, including SQL query statistics.
  5. Other application-related information.

  Of course, this monitoring also affects the load test, but if the impact is small it can be ignored. Essentially, collecting all of the information above will certainly affect test performance. If you do not collect everything at once, however, the load test can remain valid: set timers only on specific methods, collect only low-overhead hardware information, and sample data at a low frequency. Ideally, run the load test without monitoring, then run it with monitoring and compare the two. Intrusive monitoring is sometimes a good idea, but the results of such runs should not be tracked.
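
  As an illustration of a timer placed on one specific method, the sketch below accumulates a call count and a total elapsed time in two atomic counters, so that the only per-call overhead is two clock reads and two atomic updates. MethodTimer is a hypothetical class, not one of the monitoring tools used in these tests.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical low-overhead timer for one specific method or code path. */
final class MethodTimer {
    private final AtomicLong calls = new AtomicLong();
    private final AtomicLong totalNanos = new AtomicLong();

    /** Runs the task and adds its elapsed time to the running totals. */
    void time(Runnable task) {
        long start = System.nanoTime();
        try {
            task.run();
        } finally {
            totalNanos.addAndGet(System.nanoTime() - start);
            calls.incrementAndGet();
        }
    }

    long callCount() {
        return calls.get();
    }

    /** Mean elapsed time per call in microseconds, or 0 if nothing has been timed. */
    double meanMicros() {
        long n = calls.get();
        return n == 0 ? 0.0 : totalNanos.get() / 1_000.0 / n;
    }
}
```

  The totals can then be read at a low frequency by whatever collects the other statistics, which keeps the timer from turning into intrusive monitoring.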

  Sending all monitoring data to a central controller for analysis is best, but dynamic run-time tools can also provide useful information. For example, command-line tools such as ps, top, and vmstat provide this information on UNIX machines, and Performance Monitor provides it on Windows machines. Tools such as TeamQuest, BMC Patrol, SGI's Performance Co-Pilot, and ISM's PerfMan install agents on all machines in the test environment and send the information back to a central controller, which can present it in text or visual form. In this article, I use the open source Performance Co-Pilot to collect test statistics. I found that it has the least impact on the test environment and provides data in a relatively direct way.

  Java profilers provide a great deal of information, but their impact is usually too great for them to be of much use in load testing. Some tools even let you profile a server under load, but this can easily invalidate the test. In these tests, I enabled verbose garbage collection to collect memory information. I also used the jconsole and jstack tools (included in J2SE 1.5) to examine the VM under high load. I did not keep the results of the test runs in which I used them, because I do not think that data is accurate.
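
  Verbose garbage collection output is normally switched on with the -verbose:gc VM option. The same counters can also be read from inside the VM through the java.lang.management API that was introduced in J2SE 1.5. The sketch below shows one way to take two snapshots and compare them; GcSnapshot is a hypothetical class and is not how the figures in this article were produced.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical sampler: reads cumulative GC counts and times from the running VM. */
final class GcSnapshot {
    final long collections;      // total collections across all collectors
    final long collectionMillis; // approximate total time spent collecting

    private GcSnapshot(long collections, long collectionMillis) {
        this.collections = collections;
        this.collectionMillis = collectionMillis;
    }

    /** Sums the counters of every garbage collector in this VM. */
    static GcSnapshot take() {
        long count = 0;
        long millis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the collector does not report the value.
            count += Math.max(0, gc.getCollectionCount());
            millis += Math.max(0, gc.getCollectionTime());
        }
        return new GcSnapshot(count, millis);
    }

    /** Collections that happened between an earlier snapshot and this one. */
    long collectionsSince(GcSnapshot earlier) {
        return collections - earlier.collections;
    }
}
```

  Taking one snapshot per sampling interval and dividing the difference by the interval length gives a collections-per-second figure of the kind plotted in the memory panel.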

  Synchronization bottleneck

  Thread information is very useful when diagnosing server problems, especially problems such as synchronization. The jstack tool can connect to a running process and dump the stack of every thread. On UNIX systems you can send signal 3 to the process to dump its thread stacks; on Windows, press Ctrl-Break in the console. In the first test, jstack showed that many threads were blocked in the grindCPU() method.
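
  The same kind of information is available from inside the VM through Thread.getAllStackTraces(), which was added in J2SE 1.5. The sketch below counts the threads that are in the BLOCKED state with a given method on their stack. BlockedThreads is a hypothetical helper, and because the stack snapshot and the state check happen at slightly different moments, the count is approximate.

```java
import java.util.Map;

/** Hypothetical in-process check that mimics what a jstack dump reveals. */
final class BlockedThreads {
    /** Counts live threads that are BLOCKED with the given method somewhere on their stack. */
    static int countBlockedIn(String methodName) {
        int blocked = 0;
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            if (e.getKey().getState() != Thread.State.BLOCKED) {
                continue; // only threads waiting to enter a monitor are of interest
            }
            for (StackTraceElement frame : e.getValue()) {
                if (frame.getMethodName().equals(methodName)) {
                    blocked++;
                    break; // count each thread once
                }
            }
        }
        return blocked;
    }
}
```

  A large and steady count for a method such as grindCPU under load points to lock contention in that method, which is what the jstack dump showed in the first test.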

  You may have noticed that the synchronized modifier on the grindCPU() method in Listing 2 is not actually necessary. I removed it and ran the test again; Figure 2 shows the result.

  Figure 2 

  In Figure 2, you will notice that performance has declined. Although more CPU is used, both throughput and the number of failures are worse. The garbage collection cycle has changed, but 100 MB still has to be reclaimed every second. Clearly, we have not yet found the main bottleneck.
  Uncontended synchronization is expensive compared with a simple method call. Contended synchronization is more expensive still because, in addition to the memory synchronization, the VM must manage the waiting threads. In this case, those costs are evidently smaller than the memory bottleneck. In fact, eliminating the synchronization bottleneck put the VM's memory system under more pressure and ultimately led to worse throughput, even though more CPU was used. The best approach is obviously to start with the biggest bottleneck, but that is not always easy to identify. Making sure the VM's memory management is healthy is a good place to start.
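
  The serializing effect of a synchronized method can be demonstrated directly. In the sketch below, several threads call the same synchronized method while a counter records how many threads are inside it at once. SerializedWork is a hypothetical demo class and makes no attempt to measure the cost of the lock itself.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical demo: a synchronized method admits one thread at a time, however many call it. */
final class SerializedWork {
    private final AtomicInteger inside = new AtomicInteger();
    private final AtomicInteger maxInside = new AtomicInteger();

    synchronized void work() {
        int now = inside.incrementAndGet();
        maxInside.accumulateAndGet(now, Math::max);
        Thread.yield(); // give other threads a chance to enter, if the lock allowed it
        inside.decrementAndGet();
    }

    /** Calls work() from several threads and returns the highest concurrency seen inside it. */
    static int maxConcurrency(int threads, int callsPerThread) {
        SerializedWork w = new SerializedWork();
        Thread[] pool = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            pool[i] = new Thread(() -> {
                for (int c = 0; c < callsPerThread; c++) {
                    w.work();
                }
            });
            pool[i].start();
        }
        for (Thread t : pool) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return w.maxInside.get();
    }
}
```

  However many threads are started, the observed concurrency inside work() stays at one, which is why a single synchronized singleton cannot keep two CPUs busy.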

  Memory bottlenecks 

  Now I will turn to the memory problem. Listing 3 is a refactored version of the Grinder class used by GrinderServlet; it reuses a pair of StringBuffer instances. Figure 3 shows the test results.

  Listing 3 

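  package pub.capart;

  /**
   * This is a simple class designed to simulate an application consuming
   * CPU, memory, and contending for a synchronization lock.
   */
  public class Grinderv2 {
      private static Grinderv2 singleton = new Grinderv2();
      private static final String randstr =
          "This is just a random string that I'm going to add up many many times";
      // The buffers are fields and are reused, so a call allocates almost nothing.
      private StringBuffer sbuf = new StringBuffer();
      private StringBuffer sbufrev = new StringBuffer();

      public static Grinderv2 getGrinder() {
          return singleton;
      }

      public synchronized void grindCPU(int level) {
          sbufrev.setLength(0);
          sbufrev.append(randstr);
          sbuf.setLength(0);
          // Loop bound assumed to be level; the condition was lost in the source text.
          for (int i = 0; i < level; i++) {
              sbuf.append(sbufrev);
              reverse();
          }
          // The source listing ended with "return sbuf.toString();", which does not
          // compile in a void method, so it is omitted here.
      }

      // Assumed helper: the source listing calls reverse() without defining it.
      // This version refills sbufrev with the reverse of sbuf, mirroring Listing 2.
      private void reverse() {
          sbufrev.setLength(0);
          sbufrev.append(sbuf);
          sbufrev.reverse();
      }

      public String getReverse(String s) {
          StringBuffer sb = new StringBuffer(s);
          sb = sb.reverse();
          return sb.toString();
      }
  }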

  Figure 3 

  Reusing a StringBuffer is usually not a good idea, but my aim here is only to reproduce some common problems, not to provide a solution. The memory data has disappeared from the chart because the garbage collector did not run during the test. Throughput has increased dramatically, and CPU usage has returned to 50 percent. Listing 3 optimizes more than just memory, but I believe the main improvement comes from eliminating the excessive memory consumption.
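
  The reason reuse removes the garbage is that setLength(0) empties a StringBuffer without releasing its internal storage. The following sketch shows this in isolation; BufferReuse is a hypothetical class, and like the buffers in Listing 3 it must not be shared between threads without a lock.

```java
/** Hypothetical demo of buffer reuse: setLength(0) empties a StringBuffer but keeps its storage. */
final class BufferReuse {
    private final StringBuffer buf = new StringBuffer();

    /** Rebuilds the content in the shared buffer and returns the resulting length. */
    int fill(String piece, int times) {
        buf.setLength(0); // discard the old content, keep the allocated capacity
        for (int i = 0; i < times; i++) {
            buf.append(piece);
        }
        return buf.length();
    }

    /** Current size of the buffer's internal storage, in characters. */
    int capacity() {
        return buf.capacity();
    }
}
```

  After the first large fill, later calls of the same or smaller size run without growing the buffer, so they create no new garbage for the collector.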

  Revisiting the synchronization bottleneck

  Listing 4 is another refactored version of the Grinder class, this time with a small resource pool. Figure 4 shows the test results.
  Listing 4 

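  package pub.capart;

  /**
   * This is just a dummy class designed to simulate a process consuming
   * CPU, memory, and contending for a synchronization lock.
   */
  public class Grinderv3 {
      private static Grinderv3 grinders[];
      private static int grinderRoundRobin = 0;
      private static final String randstr =
          "This is just a random string that I'm going to add up many many times";
      private StringBuffer sbuf = new StringBuffer();
      private StringBuffer sbufrev = new StringBuffer();

      static {
          // A small pool of instances, each with its own lock and buffers.
          grinders = new Grinderv3[10];
          // Loop bound assumed to be the pool size; the condition was lost in the source text.
          for (int i = 0; i < grinders.length; i++) {
              grinders[i] = new Grinderv3();
          }
      }

      // Hands out pool members in round-robin order.
      public synchronized static Grinderv3 getGrinder() {
          Grinderv3 g = grinders[grinderRoundRobin];
          grinderRoundRobin = (grinderRoundRobin + 1) % grinders.length;
          return g;
      }

      public synchronized void grindCPU(int level) {
          sbufrev.setLength(0);
          sbufrev.append(randstr);
          sbuf.setLength(0);
          // Loop bound assumed to be level; the condition was lost in the source text.
          for (int i = 0; i < level; i++) {
              sbuf.append(sbufrev);
              reverse();
          }
          // The source listing ended with "return sbuf.toString();", which does not
          // compile in a void method, so it is omitted here.
      }

      // Assumed helper: the source listing calls reverse() without defining it.
      // This version refills sbufrev with the reverse of sbuf, mirroring Listing 2.
      private void reverse() {
          sbufrev.setLength(0);
          sbufrev.append(sbuf);
          sbufrev.reverse();
      }

      public String getReverse(String s) {
          StringBuffer sb = new StringBuffer(s);
          sb = sb.reverse();
          return sb.toString();
      }
  }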

  Figure 4 

  Throughput has increased somewhat, while fewer CPU resources are used. Both contended and uncontended synchronization are expensive, but usually the greatest cost of synchronization is reduced scalability. The load test no longer pushes the system to its limit, so I increased the number of virtual users, as shown in Figure 5.
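
  Listing 4 still takes a class-wide lock in getGrinder() for a very short time on every request. As a variation, and not something that was part of these tests, the round-robin index can be kept in an AtomicInteger so that handing out a pool member needs no lock at all. RoundRobinPool below is a hypothetical, generic sketch of that idea.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical round-robin pool; an AtomicInteger replaces the synchronized static getter. */
final class RoundRobinPool<T> {
    private final T[] items;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinPool(T[] items) {
        if (items.length == 0) {
            throw new IllegalArgumentException("pool must not be empty");
        }
        this.items = items.clone(); // defensive copy, the pool never changes afterwards
    }

    /** Returns pool members in rotation; floorMod keeps the index valid after int overflow. */
    T get() {
        return items[Math.floorMod(next.getAndIncrement(), items.length)];
    }
}
```

  Each pooled object still needs its own synchronization if two callers can receive the same member at the same time, as grindCPU() does in Listing 4.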

  Figure 5 

  In Figure 5, throughput drops slightly once the load reaches saturation and then decreases further as the load increases. Also note that the test drives CPU usage to 100 percent, which means the test has gone beyond the system's optimal throughput. One output of load testing is a capacity plan: an application loaded beyond its capacity delivers lower throughput.
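
  Given a series of measurements from such a test, the saturation point is simply the load level with the highest measured throughput. The sketch below assumes the results are available as two parallel arrays; SaturationPoint is a hypothetical helper.

```java
/** Hypothetical analysis helper: finds the load level with the best measured throughput. */
final class SaturationPoint {
    /**
     * @param users      virtual-user counts, one per test step
     * @param throughput requests per second measured at each step
     * @return the user count at which throughput peaked (the first one, on a tie)
     */
    static int peakLoad(int[] users, double[] throughput) {
        if (users.length == 0 || users.length != throughput.length) {
            throw new IllegalArgumentException("need one throughput value per user count");
        }
        int best = 0;
        for (int i = 1; i < users.length; i++) {
            if (throughput[i] > throughput[best]) {
                best = i;
            }
        }
        return users[best];
    }
}
```

  For capacity planning, the number of users at the peak is an upper bound; running the application somewhat below it leaves room for spikes in load.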

  Horizontal scalability

  Horizontal scaling allows much higher levels of performance, though it comes at a cost: an application that runs on multiple servers is usually more complex than one that runs in a single VM. However, horizontal scaling supports the largest increases in performance.

  Figure 6 shows my last test results. I used load balancing across three essentially identical machines, which differ slightly in memory and CPU speed. Overall throughput is three times higher than the single-machine result, and the CPUs are never fully utilized. Figure 6 shows the CPU results for only one machine; the others are the same.

  Figure 6 

  Summary 

  I once spent nine months on the deployment of a complex Java application for which no time had been set aside for performance planning. Poor performance nearly caused the customer to terminate the contract. Developers spent a lot of time with a profiler fixing a few minor problems without resolving the fundamental bottleneck, and the follow-up on the problems was thoroughly confused. In the end, load testing found the solution, but you can imagine the situation.

  I later ran into another, more difficult problem, in which an application reached only 1/100 of its expected performance. Because the problem was detected early and the need for load testing was recognized, it was resolved quickly. Load testing is not a large part of the total cost of software development, but the return in avoided risk is far greater.

  About the author 
  Ivan Small has 14 years of experience in software development. He began his career at LBNL's Supernova Cosmology Project. That project was one of two that led to the theories of antigravity and the unlimited expansion of the universe. Since then he has worked on data mining and enterprise Java applications. He is now a chief software engineer at Innovative Interfaces.

  Resources 
  Javaworld.com: javaworld.com 
  Matrix-Java developer community: http://www.matrix.org.cn/ 
  Java Performance Tuning, Second Edition: http://www.amazon.com/exec/obidos/ASIN/0596003773/javaworld
  Concurrent Programming in Java, Second Edition: http://www.amazon.com/exec/obidos/ASIN/0201310090/javaworld
  Performance Analysis for Java Web Sites: http://www.amazon.com/exec/obidos/ASIN/0201844540/javaworld
  High-Performance Java Platform Computing: http://www.amazon.com/exec/obidos/ASIN/0130161640/javaworld
  Java 2 Performance and Idiom Guide: http://www.amazon.com/exec/obidos/ASIN/0130142603/javaworld
  J2EE Performance Testing with BEA WebLogic Server (contains useful general information): http://www.amazon.com/exec/obidos/ASIN/1904284000/javaworld
  Java performance tuning: http://www.javaperformancetuning.com
  Excessive synchronization in Java, "Threading lightly": http://www-106.ibm.com/developerworks/java/library/j-threads1.html
  Load and performance testing tools: http://www.softwareqatest.com/qatweb1.html#LOAD
