summaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)Author
2009-01-29trace_sched_wakeup: Remove unused variableArnaldo Carvalho de Melo
Impact: cleanup Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-29trace: Use tracing_reset_online_cpus in more placesArnaldo Carvalho de Melo
Impact: cleanup Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Frédéric Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-27Merge branches 'tracing/blktrace', 'tracing/kmemtrace' and 'tracing/urgent' ↵Ingo Molnar
into tracing/core
2009-01-26blktrace: add ftrace pluginArnaldo Carvalho de Melo
Impact: New way of using the blktrace infrastructure This drops the requirement of userspace utilities to use the blktrace facility. Configuration is done thru sysfs, adding a "trace" directory to the partition directory where blktrace can be enabled for the associated request_queue. The same filters present in the IOCTL interface are present as sysfs device attributes. The /sys/block/sdX/sdXN/trace/enable file allows tracing without any filters. The other files in this directory: pid, act_mask, start_lba and end_lba can be used with the same meaning as with the IOCTL interface. Using the sysfs interface will only setup the request_queue->blk_trace fields, tracing will only take place when the "blk" tracer is selected via the ftrace interface, as in the following example: To see the trace, one can use the /d/tracing/trace file or the /d/tracign/trace_pipe file, with semantics defined in the ftrace documentation in Documentation/ftrace.txt. [root@f10-1 ~]# cat /t/trace kjournald-305 [000] 3046.491224: 8,1 A WBS 6367 + 8 <- (8,1) 6304 kjournald-305 [000] 3046.491227: 8,1 Q R 6367 + 8 [kjournald] kjournald-305 [000] 3046.491236: 8,1 G RB 6367 + 8 [kjournald] kjournald-305 [000] 3046.491239: 8,1 P NS [kjournald] kjournald-305 [000] 3046.491242: 8,1 I RBS 6367 + 8 [kjournald] kjournald-305 [000] 3046.491251: 8,1 D WB 6367 + 8 [kjournald] kjournald-305 [000] 3046.491610: 8,1 U WS [kjournald] 1 <idle>-0 [000] 3046.511914: 8,1 C RS 6367 + 8 [6367] [root@f10-1 ~]# The default line context (prefix) format is the one described in the ftrace documentation, with the blktrace specific bits using its existing format, described in blkparse(8). If one wants to have the classic blktrace formatting, this is possible by using: [root@f10-1 ~]# echo blk_classic > /t/trace_options [root@f10-1 ~]# cat /t/trace 8,1 0 3046.491224 305 A WBS 6367 + 8 <- (8,1) 6304 8,1 0 3046.491227 305 Q R 6367 + 8 [kjournald] 8,1 0 3046.491236 305 G RB 6367 + 8 [kjournald] 8,1 0 3046.491239 305 P NS [kjournald] 8,1 0 3046.491242 305 I RBS 6367 + 8 [kjournald] 8,1 0 3046.491251 305 D WB 6367 + 8 [kjournald] 8,1 0 3046.491610 305 U WS [kjournald] 1 8,1 0 3046.511914 0 C RS 6367 + 8 [6367] [root@f10-1 ~]# Using the ftrace standard format allows more flexibility, such as the ability of asking for backtraces via trace_options: [root@f10-1 ~]# echo noblk_classic > /t/trace_options [root@f10-1 ~]# echo stacktrace > /t/trace_options [root@f10-1 ~]# cat /t/trace kjournald-305 [000] 3318.826779: 8,1 A WBS 6375 + 8 <- (8,1) 6312 kjournald-305 [000] 3318.826782: <= submit_bio <= submit_bh <= sync_dirty_buffer <= journal_commit_transaction <= kjournald <= kthread <= child_rip kjournald-305 [000] 3318.826836: 8,1 Q R 6375 + 8 [kjournald] kjournald-305 [000] 3318.826837: <= generic_make_request <= submit_bio <= submit_bh <= sync_dirty_buffer <= journal_commit_transaction <= kjournald <= kthread Please read the ftrace documentation to use aditional, standardized tracing filters such as /d/tracing/trace_cpumask, etc. See also /d/tracing/trace_mark to add comments in the trace stream, that is equivalent to the /d/block/sdaN/msg interface. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-26ftrace: add ftrace_vprintkArnaldo Carvalho de Melo
Impact: new helper function Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-26kmemtrace: fix printk format warningsRandy Dunlap
Fix kmemtrace printk warnings: kernel/trace/kmemtrace.c:142: warning: format '%4ld' expects type 'long int', but argument 3 has type 'size_t' kernel/trace/kmemtrace.c:147: warning: format '%4ld' expects type 'long int', but argument 3 has type 'size_t' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-26Merge branches 'tracing/ftrace' and 'tracing/function-graph-tracer' into ↵Ingo Molnar
tracing/core
2009-01-23tracing/function-graph-tracer: various fixes and featuresFrederic Weisbecker
This patch brings various bugfixes: - Drop the first irrelevant task switch on the very beginning of a trace. - Drop the OVERHEAD word from the headers, the DURATION word is sufficient and will not overlap other columns. - Make the headers fit well their respective columns whatever the selected options. Ie, default options: # tracer: function_graph # # CPU DURATION FUNCTION CALLS # | | | | | | | 1) 0.646 us | } 1) | mem_cgroup_del_lru_list() { 1) 0.624 us | lookup_page_cgroup(); 1) 1.970 us | } echo funcgraph-proc > trace_options # tracer: function_graph # # CPU TASK/PID DURATION FUNCTION CALLS # | | | | | | | | | 0) bash-2937 | 0.895 us | } 0) bash-2937 | 0.888 us | __rcu_read_unlock(); 0) bash-2937 | 0.864 us | conv_uni_to_pc(); 0) bash-2937 | 1.015 us | __rcu_read_lock(); echo nofuncgraph-cpu > trace_options echo nofuncgraph-proc > trace_options # tracer: function_graph # # DURATION FUNCTION CALLS # | | | | | | 3.752 us | native_pud_val(); 0.616 us | native_pud_val(); 0.624 us | native_pmd_val(); About features, one can now disable the duration (this will hide the overhead too for convenient reasons and because on doesn't need overhead if it hasn't the duration): echo nofuncgraph-duration > trace_options # tracer: function_graph # # FUNCTION CALLS # | | | | cap_vm_enough_memory() { __vm_enough_memory() { vm_acct_memory(); } } } And at last, an option to print the absolute time: //Restart from default options echo funcgraph-abstime > trace_options # tracer: function_graph # # TIME CPU DURATION FUNCTION CALLS # | | | | | | | | 261.339774 | 1) + 42.823 us | } 261.339775 | 1) 1.045 us | _spin_lock_irq(); 261.339777 | 1) 0.940 us | _spin_lock_irqsave(); 261.339778 | 1) 0.752 us | _spin_unlock_irqrestore(); 261.339780 | 1) 0.857 us | _spin_unlock_irq(); 261.339782 | 1) | flush_to_ldisc() { 261.339783 | 1) | tty_ldisc_ref() { 261.339783 | 1) | tty_ldisc_try() { 261.339784 | 1) 1.075 us | _spin_lock_irqsave(); 261.339786 | 1) 0.842 us | _spin_unlock_irqrestore(); 261.339788 | 1) 4.211 us | } 261.339788 | 1) 5.662 us | } The format is seconds.usecs. I guess no one needs the nanosec precision here, the main goal is to have an overview about the general timings of events, and to see the place when the trace switches from one cpu to another. ie: 274.874760 | 1) 0.676 us | _spin_unlock(); 274.874762 | 1) 0.609 us | native_load_sp0(); 274.874763 | 1) 0.602 us | native_load_tls(); 274.878739 | 0) 0.722 us | } 274.878740 | 0) 0.714 us | native_pmd_val(); 274.878741 | 0) 0.730 us | native_pmd_val(); Here there is a 4000 usecs difference when we switch the cpu. Changes in V2: - Completely fix the first pointless task switch. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-23trace, lockdep: manual preempt count adding for local_bh_disableSteven Rostedt
Impact: fix to preempt trace triggering lockdep check_flag failure In local_bh_disable, the use of add_preempt_count causes the preempt tracer to start recording the time preemption is off. But because it already modified the preempt_count to show softirqs disabled, and before it called the lockdep code to handle this, it causes a state that lockdep can not handle. The preempt tracer will reset the ring buffer on start of a trace, and the ring buffer reset code does a spin_lock_irqsave. This calls into lockdep and lockdep will fail when it detects the invalid state of having softirqs disabled but the internal current->softirqs_enabled is still set. The fix is to manually add the SOFTIRQ_OFFSET to preempt count and call the preempt tracer code outside the lockdep critical area. Thanks to Peter Zijlstra for suggesting this solution. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-23trace: fix logic to start/stop countingSteven Rostedt
The logic in the tracing_start/stop code prevents the WARN_ON from ever detecting if a start/stop pair was mismatched. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-23trace: remove internal irqsoff disabling for trace outputSteven Rostedt
Impact: cleanup of duplicate features The trace output disables the ring buffer and prevents tracing to occur. The code in irqsoff to do the same thing is no longer needed. This patch removes it. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-22ring-buffer: reset timestamps when ring buffer is resetSteven Rostedt
Impact: fix bad times of recent resets The ring buffer needs to reset its timestamps when reseting of the buffer, otherwise the timestamps are stale and might be used to calculate times in the buffer causing funny timestamps to appear. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-22ring-buffer: reset timestamps when ring buffer is resetSteven Rostedt
Impact: fix bad times of recent resets The ring buffer needs to reset its timestamps when reseting of the buffer, otherwise the timestamps are stale and might be used to calculate times in the buffer causing funny timestamps to appear. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-22wakeup-tracer: show scheduling data in outputSteven Rostedt
Impact: better data for wakeup tracer This patch adds the wakeup and schedule calls that are used by the scheduler tracer to make the wakeup tracer more readable. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-22trace: separate out rt tasks from wakeup tracerSteven Rostedt
Impact: add option to trace all tasks or just RT tasks The current wakeup tracer only traces RT task wakeups. This is fine for those interested in wake up timings of RT tasks, but it is useless for those that are interested in the causes of long wakeups for non RT tasks. This patch creates a "wakeup_rt" to implement the tracing of just RT tasks (as the current "wakeup" does). And makes "wakeup" now trace all tasks as an average developer would expect. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-22ring-buffer: do not swap if recording is disabledSteven Rostedt
If the ring buffer recording has been disabled. Do not let swapping of ring buffers occur. Simply return -EAGAIN. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-22trace: do not disable wake up tracer on output of traceSteven Rostedt
Impact: fix to erased trace output To try not to have the outputing of a trace interfere with the wakeup tracer, it would disable tracing while the output was printing. But if a trace had started when it was disabled, it can show a partial trace. To try to solve this, on closing of the tracer, it would clear the trace buffer. The latency tracers (wakeup and irqsoff) have two buffers. One for recording and one for holding the max trace that is printed. The clearing of the trace above should only affect the recording buffer. But for some reason it would move the erased trace to the print buffer. Probably due to a race with the closing of the trace and the saving ofhe max race. The above is all pretty useless, and if the user does not want the printing of the trace to be traced itself, then the user can manual disable tracing. This patch removes all the code that tries to keep the output of the tracer from modifying the trace. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-21trace: set max latency variable to zero on defaultSteven Rostedt
Impact: trace max latencies on start of latency tracing This patch sets the max latency to zero whenever one of the irq variant tracers or the wakeup tracer is set to current tracer. Most developers expect to see output when starting up a latency tracer. But since the max_latency is already set to max, and it takes a latency greater than max_latency to be recorded, there is no trace. This is not the expected behavior and has even confused myself. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-21trace: stop all recording to ring buffer on ftrace_dumpSteven Rostedt
Impact: limit ftrace dump output Currently ftrace_dump only calls ftrace_kill that is a fast way to prevent the function tracer functions from being called (just sets a flag and clears the function to call, nothing else). It is better to also turn off any recording to the ring buffers as well. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-21trace: print ftrace_dump at KERN_EMERG log levelSteven Rostedt
Impact: fix to print out ftrace_dump when expected I was debugging a hard race condition to only find out that after I hit the race, my log level was not at level to show KERN_INFO. The time it took to trigger the race was wasted because I did not capture the trace. Since ftrace_dump is only called from kernel oops (and only when it is set in the kernel command line to do so), or when a developer adds it to their own local tree, the log level of the print should be at KERN_EMERG to make sure the print appears. ftrace_dump is not called by a normal user setup, and will not add extra unwanted print out to the console. There is no reason it should be at KERN_INFO. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-21ring_buffer: reset write when reserve buffer failLai Jiangshan
Impact: reset struct buffer_page.write when interrupt storm if struct buffer_page.write is not reset, any succedent committing will corrupted ring_buffer: static inline void rb_set_commit_to_write(struct ring_buffer_per_cpu *cpu_buffer) { ...... cpu_buffer->commit_page->commit = cpu_buffer->commit_page->write; ...... } when "if (RB_WARN_ON(cpu_buffer, next_page == reader_page))", ring_buffer is disabled, but some reserved buffers may haven't been committed. we need reset struct buffer_page.write. when "if (unlikely(next_page == cpu_buffer->commit_page))", ring_buffer is still available, we should not corrupt it. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-21tracing/function-graph-tracer: fix a regression while suspend to diskFrederic Weisbecker
Impact: fix a crash while kernel image restore When the function graph tracer is running and while suspend to disk, some racy and dangerous things happen against this tracer. The current task will save its registers including the stack pointer which contains the return address hooked by the tracer. But the current task will continue to enter other functions after that to save the memory, and then it will store other return addresses, and finally loose the old depth which matches the return address saved in the old stack (during the registers saving). So on image restore, the code will return to wrong addresses. And there are other things: on restore, the task will have it's "current" pointer overwritten during registers restoring....switching from one task to another... That would be insane to try to trace function graphs at these stages. This patch makes the function graph tracer listening on power events, making it's tracing disabled for the current task (the one that performs the hibernation work) while suspend/resume to disk, making the tracing safe during hibernation. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-20ring-buffer: fix alignment problemSteven Rostedt
Impact: fix to allow some archs to use the ring buffer Commits in the ring buffer are checked by pointer arithmetic. If the calculation is incorrect, then the commits will never take place and the buffer will simply fill up and report an error. Each page in the ring buffer has a small header: struct buffer_data_page { u64 time_stamp; local_t commit; unsigned char data[]; }; Unfortuntely, some of the calculations used sizeof(struct buffer_data_page) to know the size of the header. But this is incorrect on some archs, where sizeof(struct buffer_data_page) does not equal offsetof(struct buffer_data_page, data), and on those archs, the commits are never processed. This patch replaces the sizeof with offsetof. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-20trace_workqueue: use percpu data for workqueue statLai Jiangshan
Impact: use percpu data instead of a global structure Use: static DEFINE_PER_CPU(struct workqueue_global_stats, all_workqueue_stat); instead of allocating a global structure. percpu data also works well on NUMA. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-20x86, ftrace, hw-branch-tracer: change trace formatMarkus Metzger
Change the hw-branch-tracer format to be more readable. Signed-off-by: Markus Metzger <markus.t.metzger@intel.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-20x86, ftrace, hw-branch-tracer: reset trace buffer on closeMarkus Metzger
Reset the ftrace buffer on close. Since we use cyclic buffers, the trace is not contiguous, anyway. Signed-off-by: Markus Metzger <markus.t.metzger@intel.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-20x86, ftrace, hw-branch-tracer: dump trace on oopsMarkus Metzger
Dump the branch trace on an oops (based on ftrace_dump_on_oops). Signed-off-by: Markus Metzger <markus.t.metzger@intel.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-20x86, ftrace, hw-branch-tracer: support hotplug cpusMarkus Metzger
Support hotplug cpus. Reported-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Markus Metzger <markus.t.metzger@intel.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-18Merge branch 'tracing/ftrace'; commit 'v2.6.29-rc2' into tracing/coreIngo Molnar
2009-01-16Merge branch 'misc' into releaseLen Brown
2009-01-16kprobes: check CONFIG_FREEZER instead of CONFIG_PMMasami Hiramatsu
Check CONFIG_FREEZER instead of CONFIG_PM because kprobe booster depends on freeze_processes() and thaw_processes() when CONFIG_PREEMPT=y. This fixes a linkage error which occurs when CONFIG_PREEMPT=y, CONFIG_PM=y and CONFIG_FREEZER=n. Reported-by: Cheng Renquan <crquan@gmail.com> Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Len Brown <len.brown@intel.com>
2009-01-16PM: Fix freezer compilation if PM_SLEEP is unsetRafael J. Wysocki
Freezer fails to compile if with the following configuration settings: CONFIG_CGROUPS=y CONFIG_CGROUP_FREEZER=y CONFIG_MODULES=y CONFIG_FREEZER=y CONFIG_PM=y CONFIG_PM_SLEEP=n Fix this by making process.o compilation depend on CONFIG_FREEZER. Reported-by: Cheng Renquan <crquan@gmail.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Len Brown <len.brown@intel.com>
2009-01-16trace: set max latency variable to zero on defaultSteven Rostedt
Impact: trace max latencies on start of latency tracing This patch sets the max latency to zero whenever one of the irq variant tracers or the wakeup tracer is set to current tracer. Most developers expect to see output when starting up a latency tracer. But since the max_latency is already set to max, and it takes a latency greater than max_latency to be recorded, there is no trace. This is not the expected behavior and has even confused myself. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16ftrace: remove static from function tracer functionsSteven Rostedt
Impact: clean up After reorganizing the functions in trace.c and trace_function.c, they no longer need to be in global context. This patch makes the functions and one variable into static. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16ftrace: combine stack trace in function callSteven Rostedt
Impact: less likely to interleave function and stack traces This patch does replaces the separate stack trace on function with a record function and stack trace together. This will switch between the function only recording to a function and stack recording. Also some whitespace fix ups as well. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16trace: clean up format errors in calls to trace_seq_printfSteven Rostedt
After adding the printf format checking for trace_seq_printf, several warnings now show up. This patch cleans them up. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16trace: add gcc printf check to trace_seq_printfSteven Rostedt
Andrew Morton suggested adding a printf checker to trace_seq_printf since there are a number of users that have improper format arguments. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16ftrace: move function tracer functions out of trace.cSteven Rostedt
Impact: clean up of trace.c The function tracer functions were put in trace.c because it needed to share static variables that were in trace.c. Since then, those variables have become global for various reasons. This patch moves the function tracer functions into trace_function.c where they belong. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16ftrace: add stack trace to function tracerSteven Rostedt
Impact: new feature to stack trace any function Chris Mason asked about being able to pick and choose a function and get a stack trace from it. This feature enables his request. # echo io_schedule > /debug/tracing/set_ftrace_filter # echo function > /debug/tracing/current_tracer # echo func_stack_trace > /debug/tracing/trace_options Produces the following in /debug/tracing/trace: kjournald-702 [001] 135.673060: io_schedule <-sync_buffer kjournald-702 [002] 135.673671: <= sync_buffer <= __wait_on_bit <= out_of_line_wait_on_bit <= __wait_on_buffer <= sync_dirty_buffer <= journal_commit_transaction <= kjournald Note, be careful about turning this on without filtering the functions. You may find that you have a 10 second lag between typing and seeing what you typed. This is why the stack trace for the function tracer does not use the same stack_trace flag as the other tracers use. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16ftrace: fix trace_outputLai Jiangshan
Impact: fix bug for handling partial line trace_seq_printf(), seq_print_userip_objs(), ... return 0 -- partial line was written other(>0) -- success duplicate output is also removed in trace_print_raw(). Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-15Merge branch 'sched-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: sched_slice() fixlet sched: fix update_min_vruntime sched: SCHED_OTHER vs SCHED_IDLE isolation sched: SCHED_IDLE weight change sched: fix bandwidth validation for UID grouping Revert "sched: improve preempt debugging"
2009-01-15resources: fix parameter name and kernel-docRandy Dunlap
Fix __request_region() parameter kernel-doc notation and parameter name: Warning(linux-2.6.28-git10//kernel/resource.c:627): No description found for parameter 'flags' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-15cgroups: consolidate cgroup documentsLi Zefan
Move Documentation/cpusets.txt and Documentation/controllers/* to Documentation/cgroups/ Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Acked-by: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-15sched: sched_slice() fixletLin Ming
Mike's change: 0a582440f "sched: fix sched_slice())" broke group scheduling by forgetting to reload cfs_rq on each loop. This patch fixes aim7 regression and specjbb2005 regression becomes less than 1.5% on 8-core stokley. Signed-off-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Jayson King <dev@jaysonking.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-15[IA64] dump stack on kernel unaligned warningsDoug Chapman
Often the cause of kernel unaligned access warnings is not obvious from just the ip displayed in the warning. This adds the option via proc to dump the stack in addition to the warning. The default is off (just display the 1 line warning). To enable the stack to be shown: echo 1 > /proc/sys/kernel/unaligned-dump-stack Signed-off-by: Doug Chapman <doug.chapman@hp.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2009-01-15sched: fix update_min_vruntimePeter Zijlstra
Impact: fix SCHED_IDLE latency problems OK, so we have 1 running task A (which is obviously curr and the tree is equally obviously empty). 'A' nicely chugs along, doing its thing, carrying min_vruntime along as it goes. Then some whacko speed freak SCHED_IDLE task gets inserted due to SMP balancing, which is very likely far right, in that case update_curr update_min_vruntime cfs_rq->rb_leftmost := true (the crazy task sitting in a tree) vruntime = se->vruntime and voila, min_vruntime is waaay right of where it ought to be. OK, so why did I write it like that to begin with... Aah, yes. Say we've just dequeued current schedule deactivate_task(prev) dequeue_entity update_min_vruntime Then we'll set vruntime = cfs_rq->min_vruntime; we find !cfs_rq->curr, but do find someone in the tree. Then we _must_ do vruntime = se->vruntime, because vruntime = min_vruntime(vruntime := cfs_rq->min_vruntime, se->vruntime) will not advance vruntime, and cause lags the other way around (which we fixed with that initial patch: 1af5f730fc1bf7c62ec9fb2d307206e18bf40a69 (sched: more accurate min_vruntime accounting). Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Mike Galbraith <efault@gmx.de> Acked-by: Mike Galbraith <efault@gmx.de> Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-15sched: SCHED_OTHER vs SCHED_IDLE isolationPeter Zijlstra
Stronger SCHED_IDLE isolation: - no SCHED_IDLE buddies - never let SCHED_IDLE preempt on wakeup - always preempt SCHED_IDLE on wakeup - limit SLEEPER fairness for SCHED_IDLE. Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-15sched: SCHED_IDLE weight changePeter Zijlstra
Increase the SCHED_IDLE weight from 2 to 3, this gives much more stable vruntime numbers. time advanced in 100ms: weight=2 64765.988352 67012.881408 88501.412352 weight=3 35496.181411 34130.971298 35497.411573 Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-15sched: fix bandwidth validation for UID groupingPeter Zijlstra
Impact: make rt-limit tunables work again Mark Glines reported: > I've got an issue on x86-64 where I can't configure the system to allow > RT tasks for a non-root user. > > In 2.6.26.5, I was able to do the following to set things up nicely: > echo 450000 >/sys/kernel/uids/0/cpu_rt_runtime > echo 450000 >/sys/kernel/uids/1000/cpu_rt_runtime > > Seems like every value I try to echo into the /sys files returns EINVAL. For UID grouping we initialize the root group with infinite bandwidth which by default is actually more than the global limit, therefore the bandwidth check always fails. Because the root group is a phantom group (for UID grouping) we cannot runtime adjust it, therefore we let it reflect the global bandwidth settings. Reported-by: Mark Glines <mark@glines.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-15tracing: trace_stat.c cleanupIngo Molnar
Impact: cleanup - whitespace / code alignment cleanups - avoid unnecessary forward prototype by reordering functions Signed-off-by: Ingo Molnar <mingo@elte.hu>