This patch adds the nr_running_cpumask() function to get the number of
running tasks per cluster.
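A minimal sketch of such a helper, assuming it simply sums rq->nr_running
over the online CPUs of the cluster (the parameter name is illustrative,
not necessarily the actual patch):

static unsigned long nr_running_cpumask(const struct cpumask *cluster_mask)
{
        unsigned long nr = 0;
        int cpu;

        /* Sum the runnable-task counts of the cluster's online CPUs. */
        for_each_cpu_and(cpu, cluster_mask, cpu_online_mask)
                nr += cpu_rq(cpu)->nr_running;

        return nr;
}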
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
|
|
[original commit message]
Commit cd5c2cc93d3d (hmp: Remove potential for task_struct access
race) introduced a put_task_struct() to prevent races, but in
doing so introduced potential spinlock recursion. (This change was further
consolidated in commit 0baa5811bacf -- sched: hmp: unify active migration
code.)
Unfortunately, the put_task_struct() is done while the runqueue
spinlock is held, but put_task_struct() can also cause a reschedule
causing the runqueue lock to be acquired recursively.
To fix, move the put_task_struct() outside the runqueue spinlock.
[additional commit message by Chanwoo Choi]
We did not apply hmp patch [1], which cleans up the code by sharing the
same code path, because applying it caused a scheduling problem.
[1] commit 0baa5811bacf -- sched: hmp: unify active migration code
So, this patch moves the put_task_struct() just outside the runqueue
spinlock.
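A sketch of the resulting pattern (the function and the field holding the
pending task are hypothetical, not the actual HMP code):

static void hmp_finish_migration(struct rq *rq)         /* hypothetical */
{
        struct task_struct *p;
        unsigned long flags;

        raw_spin_lock_irqsave(&rq->lock, flags);
        p = rq->migrate_task;   /* hypothetical: task pinned earlier */
        rq->migrate_task = NULL;
        raw_spin_unlock_irqrestore(&rq->lock, flags);

        /*
         * Safe here: dropping the last reference may reschedule, and
         * the runqueue lock is no longer held.
         */
        put_task_struct(p);
}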
Reported-by: Victor Lixin <victor.lixin@hisilicon.com>
Cc: Jorge Ramirez-Ortiz <jorge.ramirez-ortiz@linaro.org>
Cc: Liviu Dudau <Liviu.Dudau@arm.com>
Signed-off-by: Kevin Hilman <khilman@linaro.org>
Reviewed-by: Jon Medhurst <tixy@linaro.org>
Reviewed-by: Alex Shi <alex.shi@linaro.org>
Reviewed-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[cw00.choi: Fix the merge conflict]
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
|
|
Frequently in HMP, the big CPUs are only active with one task per
CPU and there may be idle CPUs in the big cluster. This patch avoids
triggering an idle balance in situations where none of the active
CPUs in the current HMP domain has more than one task running.
When packing is enabled, only enforce this behaviour when we are
not in the smallest domain - in that domain we idle balance whenever
a CPU is over the up_threshold, regardless of task count, in case one
needs to be moved.
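A sketch of the kind of check this implies, assuming a hypothetical
helper over the domain's cpumask:

/* Hypothetical helper: is there anything worth idle-balancing for? */
static bool hmp_domain_has_pullable_task(const struct cpumask *domain_cpus)
{
        int cpu;

        for_each_cpu_and(cpu, domain_cpus, cpu_online_mask)
                if (cpu_rq(cpu)->nr_running > 1)
                        return true;    /* a CPU has a task to spare */

        return false;   /* every active CPU runs at most one task */
}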
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
|
|
System services are generally started by init, whilst kernel threads
are started by kthreadd. We do not want to give those tasks a head
start, as this costs power for very little benefit. We do however
wish to do that for tasks which the user launches.
Further, some tasks allocate per-cpu timers directly after launch
which can lead to those tasks being always scheduled on a big CPU
when there is no computational need to do so. Not promoting services
to big CPUs on launch will prevent that, unless a service allocates
its per-cpu resources after a period of intense computation, which
is not a common pattern.
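A sketch of one way to express the policy (the helper name and the
parent tests are assumptions based on the description, not the actual
patch):

/* Hypothetical: does this freshly forked task deserve a big-CPU start? */
static bool hmp_task_should_fork_boost(struct task_struct *p)
{
        struct task_struct *parent = p->real_parent;

        /* init (PID 1) starts system services; kthreadd parents kthreads. */
        if (parent->pid == 1 || (parent->flags & PF_KTHREAD))
                return false;

        return true;    /* user-launched task: give it a head start */
}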
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
|
|
Use a per-CPU cpuidle driver to fix a deadlock in hmp_idle_pull.
Otherwise a deadlock occurs during bl_idle_init():
[ 113.878664] other info that might help us debug this:
[ 113.878667] Possible unsafe locking scenario:
[ 113.878667]
[ 113.878670] CPU0
[ 113.878673] ----
[ 113.878681] lock(cpuidle_driver_lock);
[ 113.878684] <Interrupt>
[ 113.878691] lock(cpuidle_driver_lock);
[ 113.878693]
[ 113.878693] *** DEADLOCK ***
[ 113.878693]
[ 113.878697] 1 lock held by ksoftirqd/4/28:
[ 113.878719] #0: (hmp_force_migration){+.....}, at: [<c0054da5>] hmp_idle_pull+0x49/0x508
This patch is just a quick/cheap workaround for cpuidle_driver_lock
deadlock. It works for TC2 and any other platform where the idle
driver cannot be changed at runtime.
Signed-off-by: Alex Shi <alex.shi@linaro.org>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
|
|
If someone hotplugs all the little CPUs while another CPU is handling
a wakeup, we can potentially return new_cpu == NR_CPUS from
hmp_select_slower_cpu (which is called internally by
hmp_best_little_cpu as well). That value is then used to dereference
the per_cpu rq array in hmp_next_down_delay, which can go boom.
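A sketch of the defensive check (the wrapper is hypothetical; only
hmp_select_slower_cpu is named by the patch):

static int hmp_safe_slower_cpu(struct task_struct *p, int prev_cpu)
{
        int new_cpu = hmp_select_slower_cpu(p, prev_cpu);

        /*
         * All little CPUs may be offline: never index per-CPU data
         * with NR_CPUS; fall back to the CPU we already have.
         */
        if (new_cpu >= nr_cpu_ids)
                new_cpu = prev_cpu;

        return new_cpu;
}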
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
|
|
When looking for a task to be idle-pulled, don't consider tasks
where the affinity does not allow that task to be placed on the
target CPU. Also ensure that tasks with restricted affinity
do not block selecting other unrestricted busy tasks.
Use the knowledge of the target CPU more effectively in idle pull
by passing it to hmp_get_heaviest_task when we know it, otherwise
only checking for general affinity matches with any of the CPUs
in the bigger HMP domain.
We still need to explicitly check affinity is allowed in idle pull
since if we find no match in hmp_get_heaviest_task we will return
the current one, which may not be affine to the new CPU despite
having high enough load. In this case, there is nothing to move.
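A sketch of the affinity test used during the candidate scan (the
predicate is hypothetical; tsk_cpus_allowed() is the era's accessor
for p->cpus_allowed):

/* Hypothetical predicate used inside the heaviest-task scan. */
static bool hmp_task_fits_target(struct task_struct *p, int target_cpu)
{
        /* A pinned candidate must not hide movable tasks behind it. */
        return target_cpu < 0 ||
               cpumask_test_cpu(target_cpu, tsk_cpus_allowed(p));
}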
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
When a normal forced up-migration takes place we stop the task to
be migrated while the target CPU becomes available. This delay can
range from 80us to 1500us on TC2 if the target CPU is in a deep idle
state.
Instead, interrupt the target CPU and ask it to pull a task.
This lets the current eligible task continue executing on the
original CPU while the target CPU wakes. Use a pinned timer to
prevent the pulling CPU from going back into power-down with pending
up-migrations.
If we trigger for a nohz kick, it doesn't matter about triggering
for an idle pull since the idle_pull flag will be set when we
execute the softirq and we'll still do the idle pull.
If the target CPU is busy, we will not pull any tasks.
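A sketch of the kick itself (the helper and per-rq flag are
hypothetical; smp_send_reschedule() is the standard IPI used to wake a
CPU's scheduler):

static void hmp_kick_cpu_for_pull(int target_cpu)       /* hypothetical */
{
        /*
         * Record the request, then interrupt the big CPU so it pulls
         * the task itself once it is awake.
         */
        cpu_rq(target_cpu)->wake_for_idle_pull = 1;     /* hypothetical flag */
        smp_send_reschedule(target_cpu);
}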
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
|
|
All platforms other than TC2 default to enabling packing. Since TC2
shows no performance or energy degradation with this feature enabled
make it default enabled the same as everyone else.
Likewise, vendors have been including TC2 support in multi-machine
kernel builds so they expect the default thresholds to remain the
same when the TC2 #ifdef is removed.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
|
|
This patch fixes a build break: Linux 4.0 no longer includes the
cpumask_scnprintf() function, so use the alternative,
cpumap_print_to_pagebuf().
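The substitution looks roughly like this (buffer and mask names are
illustrative); the new function's first argument selects list format
("0-3") over mask format ("0f"):

/* Before (API removed): */
len = cpumask_scnprintf(buf, PAGE_SIZE, mask);

/* After: */
len = cpumap_print_to_pagebuf(false, buf, mask);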
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
[k.kozlowski: rebased on 4.1]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
In anticipation of modifying the up_threshold handling, make all
instances use the same utility fn to check if a task is eligible
for up-migration. This also removes the previous difference in
threshold comparison where up-migration used '!<threshold' and
idle pull used '>threshold' to decide up-migration eligibility.
Make them both use '!<threshold' instead for consistency, although
this is unlikely to change any results.
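A sketch of what the shared helper could look like (the name and
placement are assumptions; load_avg_ratio and hmp_up_threshold are the
HMP fields these patches work with):

static inline bool hmp_task_eligible_for_up_migration(struct sched_entity *se)
{
        /* '!<' rather than '>': a load exactly on the threshold counts. */
        return !(se->avg.load_avg_ratio < hmp_up_threshold);
}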
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
|
|
There is an error scenario where, on a 1x1 HMP system (weight of the
hmp_slow_cpu_mask is 1), the short-cut of restricting the allowed cpu
mask of an rt task triggers a kernel bug in the rt sched class
set_cpus_allowed function, set_cpus_allowed_rt().
If the task is on the run-queue, the weight of the required cpu mask
is 1, and this differs from p->nr_cpus_allowed, this back-end function
interprets the change as the task going from migratable to not
migratable and decrements the rt_nr_migratory counter. The
BUG_ON(!rq->rt.rt_nr_migratory) check in this code path then triggers.
To circumvent this issue, set the number of allowed cpus for a task p
to the weight of the hmp_slow_cpu_mask before calling
do_set_cpus_allowed() in __setscheduler(). It will be set to this
value in do_set_cpus_allowed() after the call to the sched class
related back-end function anyway. By doing this, set_cpus_allowed_rt()
returns without trying to update the rt_nr_migratory counter.
This patch has been tested with a test device driver requiring a
threaded irq handler on a TC2 system with a reduced cpu mask
(1 Cortex A15, 1 Cortex A7).
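A sketch of the circumvention as described (placed in __setscheduler()
per the text; surrounding code omitted):

/*
 * Pre-set nr_cpus_allowed so set_cpus_allowed_rt() sees no change
 * in migratability and leaves rt_nr_migratory alone.
 */
p->nr_cpus_allowed = cpumask_weight(&hmp_slow_cpu_mask);
do_set_cpus_allowed(p, &hmp_slow_cpu_mask);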
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
|
|
This patch limits the default affinity mask for all irqs to the cluster of
the little cpus.
This patch has the positive side effect that an irq thread which has
its IRQTF_RUNTHREAD set inside irq_thread() -> irq_wait_for_interrupt()
will no longer overwrite its struct task_struct->cpus_allowed with the
full cpu mask from desc->irq_data.affinity in
irq_thread_check_affinity(), which would essentially revert the patch
"HMP: experimental: Force all rt tasks to start on little domain."
for this irq thread.
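A sketch of the idea (the initcall level and helper name are
assumptions; irq_default_affinity is the kernel's default mask and
hmp_slow_cpu_mask is named by these patches):

static int __init hmp_restrict_irq_affinity(void)
{
        /* Default all irqs to the little cluster's CPUs. */
        cpumask_copy(irq_default_affinity, &hmp_slow_cpu_mask);
        return 0;
}
core_initcall(hmp_restrict_irq_affinity);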
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
|
|
We use get_task_struct to increment the ref count on a task_struct
so that even if the task dies with a pending migration we are still
able to read the memory without causing a fault.
In the case of non-running tasks, we forgot to decrement the ref
count when we are done with the task.
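A sketch of the balanced pattern (the function and the direct-move
path are illustrative):

static int hmp_move_nonrunning_task(struct rq *rq, struct task_struct *p)
{
        get_task_struct(p);             /* pin p across the migration */

        if (!task_running(rq, p)) {
                /* ... move the task directly, no stopper needed ... */
                put_task_struct(p);     /* the previously missing put */
                return 0;
        }

        /* running: fall back to the stopper, which does its own put */
        return -EAGAIN;
}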
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
|
|
Since TC2 power curves don't really have a utilisation hotspot where
packing makes sense, if it is present for a TC2 system at least make
it default to disabled.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
The presence of packing permanently changed the idle balance
behaviour. Do not restrict idle balance on the smallest CPUs when
packing is present but disabled.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
If we migrate a sleeping task away from a CPU which has the
tick stopped, then both the clock_task and decay_counter will
be out of date for that CPU and we will not decay load correctly
regardless of how often we update the blocked load.
This is only an issue for tasks which are not on a runqueue
(because otherwise that CPU would be awake) and simultaneously
the CPU the task previously ran on has had the tick stopped.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
If an entity happens to sleep for less than one tick duration
the tracked load associated with that entity can be decayed by an
unexpectedly large amount if it is later migrated to a different
CPU. This can interfere with correct scheduling when entity load
is used for decision making.
The reason for this is that when an entity is dequeued and enqueued
quickly, such that se.avg.decay_count and cfs_rq.decay_counter
do not differ when that entity is enqueued again,
__synchronize_entity_decay skips the calculation step and also skips
clearing the decay_count. At a later time that entity may be
migrated and its load will be decayed incorrectly.
All users of this function expect decay_count to be zeroed after
use.
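A sketch of the fix, based on the 3.x-era __synchronize_entity_decay()
(reconstructed from the description, not the verbatim patch): move the
zeroing above the early return.

static inline u64 __synchronize_entity_decay(struct sched_entity *se)
{
        struct cfs_rq *cfs_rq = cfs_rq_of(se);
        u64 decays = atomic64_read(&cfs_rq->decay_counter);

        decays -= se->avg.decay_count;
        se->avg.decay_count = 0;  /* moved up: zero even when decays == 0 */
        if (!decays)
                return 0;

        se->avg.load_avg_contrib =
                decay_load(se->avg.load_avg_contrib, decays);

        return decays;
}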
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
If we isolate CPUs, then we don't want random device interrupts on
them. Even without the user-space irq balancer enabled we can end up
with irqs on non-boot cpus.
Allow restricting the default irq affinity mask.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
hmp_variable_scale_convert was used without guards in
__update_entity_runnable_avg. Guard it.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
In order to allow userspace to restrict known low-load tasks to
little CPUs, we must export this knowledge from the kernel or
expect userspace to make their own attempts at figuring it out.
Since we now have a userspace requirement for an HMP implementation
to always have at least some sysfs files, change the integration
so that it only depends upon CONFIG_SCHED_HMP rather than
CONFIG_HMP_VARIABLE_SCALE. Fix Kconfig text to match.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
When migrating a runnable task, we use the CPU stopper on
the source CPU to ensure that the task to be moved is not
currently running. Before this patch, all forced migrations
(up, offload, idle pull) use the stopper for every migration.
Using the CPU stopper is mandatory only when a task is currently
running on a CPU. Otherwise tasks can be moved by locking the
source and destination run queues.
This patch checks whether the task to be moved is currently
running. If not, the task is moved directly without using the
stopper thread.
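A sketch of the two paths (the wrapper and stopper callback names are
illustrative; the lock and move primitives are the standard ones):

static void hmp_migrate_runnable_task(struct rq *src_rq, struct rq *dst_rq,
                                      struct task_struct *p, int dst_cpu)
{
        if (!task_running(src_rq, p)) {
                /* Not executing: move directly under both rq locks. */
                double_rq_lock(src_rq, dst_rq);
                deactivate_task(src_rq, p, 0);
                set_task_cpu(p, dst_cpu);
                activate_task(dst_rq, p, 0);
                check_preempt_curr(dst_rq, p, 0);
                double_rq_unlock(src_rq, dst_rq);
        } else {
                /* Executing: the source CPU's stopper must preempt it. */
                stop_one_cpu_nowait(cpu_of(src_rq), hmp_migration_cpu_stop,
                                    p, &src_rq->active_balance_work);
        }
}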
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
If we wake up a task on a little CPU, fill CPUs rather than
spread. Adds 2 new files to /sys/kernel/hmp to control packing
behaviour.
packing_enable: task packing enabled (1) or disabled (0)
packing_limit: Runqueues will be filled up to this load ratio.
This functionality is disabled by default on TC2 as it lacks per-cpu
power gating so packing small tasks there doesn't make sense.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
Accessing the task_struct can be racy in certain conditions, so
we need to only acquire the data when needed.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
The previous API for hmp_up_migration reset the destination
CPU every time, regardless of if a migration was desired. The code
using it assumed that the value would not be changed unless
a migration was required. In one rare circumstance, this could
have led to a task migrating to a little CPU at the wrong time.
Fixing that led to a slight logical tweak to make the surrounding
APIs operate a bit more obviously.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Robin Randhawa <robin.randhawa@arm.com>
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
Add generic tracing for smp_cross_call function calls.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
1. Replace magic numbers in code for migration trace.
Trace points still emit a number as force=<n> field:
force=0 : wakeup migration
force=1 : forced migration
force=2 : offload migration
force=3 : idle pull migration
2. Add trace to expose offload decision-making.
Also adds tracing of rq->nr_running so that you can look back to see
what state the RQ was in at the time.
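A sketch of the named values (the enum is reconstructed from the list
above; the identifier names are assumptions):

enum hmp_migrate_reason {
        HMP_MIGRATE_WAKEUP    = 0,      /* wakeup migration */
        HMP_MIGRATE_FORCE     = 1,      /* forced migration */
        HMP_MIGRATE_OFFLOAD   = 2,      /* offload migration */
        HMP_MIGRATE_IDLE_PULL = 3,      /* idle pull migration */
};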
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
When the up-threshold is at 512 on TC2, behaviour looks OK since
the graphic-related tasks are very heavy due to lack of a GPU.
Increasing the up-threshold does not reduce power consumption.
When a GPU is present, graphic tasks are much less CPU-heavy and
so additional power may be saved by having a higher threshold.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
Prevents fork-migration adversely interacting with normal
migration (i.e. runqueues containing forked tasks being
selected as migration targets when there is a better
choice available)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
Avoids accesses through cfs_rq going bad when the cpu_rq doesn't
have a cfs member.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
When an A15 goes idle, we should up-migrate anything which is
above the threshold and running on an A7.
Reuses the HMP force-migration spinlock, but adds its own new
cpu stopper client.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
rq->nr_running was better than cfs.nr_running, since it includes
all tasks actually on the CPU. However, it includes RT tasks which
we would rather ignore at this point.
Switching to cfs.h_nr_running includes all the CFS tasks but no
RT tasks.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
Experimentally, one of the best policies for HMP migration CPU
selection is to completely ignore part-loaded CPUs and only look
for idle ones. If there are no idle ones, we will choose the one
which was least-recently-disturbed.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
The original intent here was to track unweighted runqueue load
with less resolution so we could use the least-recently-disturbed
runqueue to choose between 'closely related' load levels.
However, after experimenting with the resolution it turns out
that the following algorithm is highly beneficial for mobile
workloads.
In hmp_domain_min_load:
* If any CPU's load is zero, the overall load is zero
* If no CPUs are idle, the domain is 'fully loaded'
Additionally, the time since the last migration is used to
discriminate between idle CPUs.
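A sketch of the policy (the signature and the last-migration field are
assumptions; idleness is approximated here by nr_running == 0):

static unsigned int hmp_domain_min_load(struct hmp_domain *hmpd,
                                        int *min_cpu)
{
        int cpu, best_cpu = -1;
        u64 oldest = ULLONG_MAX;

        for_each_cpu_and(cpu, &hmpd->cpus, cpu_online_mask) {
                if (cpu_rq(cpu)->nr_running)
                        continue;       /* busy CPUs are ignored */
                /* Prefer the least-recently-disturbed idle CPU. */
                if (cpu_rq(cpu)->hmp_last_migration < oldest) { /* hypothetical field */
                        oldest = cpu_rq(cpu)->hmp_last_migration;
                        best_cpu = cpu;
                }
        }

        if (best_cpu < 0)
                return 1023;    /* no idle CPU: domain is 'fully loaded' */

        *min_cpu = best_cpu;
        return 0;       /* an idle CPU exists: overall load is zero */
}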
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
Track when migrations were performed to runqueues.
Use this to decide between runqueues as migration targets when
runqueues in an hmp domain have equal load.
The intention is to spread migration load amongst CPUs more fairly.
When all CPUs in an hmp domain are fully loaded, the existing code
always selects the last CPU as a migration target - this is unfair
and little better than doing no selection.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
The hmp_get_{lightest,heaviest}_task() functions need to use
__pick_first_entity() to get a pointer to a sched_entity on the rq.
The current task is not kept on the rq while running, so its rb-tree
node pointers are no longer valid.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
When we are looking for a task to migrate up, select the heaviest
of the first 5 runnable tasks on the runqueue.
Likewise, when looking for a task to offload, select the lightest
of the first 5 runnable tasks on the runqueue.
Ensure the selected task is runnable in the target domain.
This change is necessary in order to implement idle pull in a
sensible manner, but is used here in up-migration and offload to
select the correct target task.
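A sketch of the bounded scan for the up-migration case (the helper name
and faster-domain mask are assumptions; __pick_first_entity() and
__pick_next_entity() walk the rbtree in vruntime order):

static struct task_struct *
hmp_heaviest_of_first_five(struct cfs_rq *cfs_rq, const struct cpumask *target)
{
        struct sched_entity *se = __pick_first_entity(cfs_rq);
        struct task_struct *heaviest = NULL;
        int scanned = 0;

        while (se && scanned++ < 5) {
                struct task_struct *p = task_of(se);

                /* Only consider tasks allowed in the target domain. */
                if (cpumask_intersects(target, tsk_cpus_allowed(p)) &&
                    (!heaviest || p->se.avg.load_avg_ratio >
                                  heaviest->se.avg.load_avg_ratio))
                        heaviest = p;

                se = __pick_next_entity(se);
        }

        return heaviest;
}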
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
It is sometimes desirable to run a kernel with HMP scheduling enabled
on a system which is not big.LITTLE, e.g. when building a multi-platform
kernel, or when testing a big.LITTLE system with one cluster disabled.
We should therefore allow for the situation where there is no little
domain.
Signed-off-by: Jon Medhurst <tixy@linaro.org>
Signed-off-by: Mark Brown <broonie@linaro.org>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
This patch restricts the allowed cpu mask for rt tasks initially started
with a full cpu mask to the little domain.
An rt task is specified as real time in __setscheduler() which is finally
called for all rt tasks (kernel and user land). In this function we
restrict the allowed cpu mask to the little domain.
This also prevents an rt task from later being pushed to the big
domain, because find_lowest_rq() only considers the allowed cpu mask
of a task when finding the new cpu for it to run on.
Current kludges of the patch:
* Since we do not have an API to get the cpu mask of the A7 cluster,
hmp_slow_cpu_mask is made global in arm/kernel/topology.c for now.
* The watchdog_enable() function calls sched_setscheduler() before
kthread_bind() for the cpu specific watchdog kernel threads. The order of
these two calls has to be changed to make this patch work.
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
Fix a build error in find_new_ilb():
kernel/sched/fair.c: In function ‘find_new_ilb’:
kernel/sched/fair.c:7973:42: error: ‘call_cpu’ undeclared (first use in this function)
   &((struct hmp_domain *)hmp_cpu_domain(call_cpu))->cpus);
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
|
|
There is little point in doing a nohz balance kick on a CPU from a
different HMP domain, since the unset SD_LOAD_BALANCE flag on the CPU
domain level prevents tasks from being balanced across clusters
except through the per-task load driven hmp_migrate/hmp_offload paths.
Further, the nohz balance kick is actively harmful to power usage if
all the tasks fit into the little domain since it causes the big
domain to wake up and do a lot of calculation to determine that
there is nothing to do.
A more generic solution is to walk the sched domain tree and determine
the intersection of potential idle balance cpus with visibility of
tasks on the current CPU, however HMP domains are more easily
accessible.
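A sketch of the resulting find_new_ilb() restriction (reconstructed
from the description and the build-fix excerpt earlier in this log, not
the verbatim patch):

static inline int find_new_ilb(int call_cpu)
{
        /* Only kick an idle CPU in the caller's own HMP domain. */
        int ilb = cpumask_first_and(nohz.idle_cpus_mask,
                                    &hmp_cpu_domain(call_cpu)->cpus);

        if (ilb < nr_cpu_ids && idle_cpu(ilb))
                return ilb;

        return nr_cpu_ids;
}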
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
Initialise the load stats for new tasks so that they do not
see the instability in early task life which makes it so hard to
decide which CPU is appropriate.
Also, change the fork balance algorithm so that the least loaded of
the CPUs in the big cluster is chosen regardless of the bigness of
the parent task.
This is intended to help performance for applications which use
many short-lived tasks. Although best practise is usually to use
a thread pool, apps which do not do this should not be subject to
the randomness of the early stats.
We should ignore real-time threads for forking on big CPUs, but
it is not possible to figure out if a new thread is real-time or
not at the fork stage. Instead, we prevent kernel threads from
getting the initial boost - when they later become real-time they
will only be on big if their compute requirements demand it.
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
When evaluating a migration we make two calls to hmp_domain_min_load.
This is unnecessary if we pass on the target CPU information from the
hmp_up_migration path.
In hmp_down_migration, we don't consider the load of the target CPUS.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
The reference patch set always selects the first CPU in an HMP
domain as a migration target. In busy situations, this means that
the migrated thread cannot make immediate use of an idle CPU but
must share a busy one until the load balancer runs across the big
domain.
This patch uses the hmp_domain_min_load function introduced in
global balancing to figure out which of the CPUs is the least busy
and selects that as a migration target - in both directions.
This essentially implements a task-spread strategy and is intended
to maximise performance of migrated threads but is likely
to use more power than the packing strategy previously employed.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
Normal task and runqueue loading is scaled according to priority
to end up with a weighted load, known as the contribution.
We want the CPU time to be allotted according to priority, but
we also want to make big/little decisions based upon raw load.
It is common, for example, for Android apps following the dev
guide to end up with all their long-running or async action
threads as low priority unless they override the AsyncThread
constructor. All these threads are such low priority that they
become invisible to the hmp_offload routine.
Using unweighted load here allows us to maximise CPU usage in busy
situations.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
rq->nr_running is the actual number of runnable tasks we wish to use
to determine if a task is alone on a CPU.
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
This patch introduces an extra check at task up-migration to
prevent overloading the cpus in the faster hmp_domain while the
slower hmp_domain is not fully utilized. The patch also introduces
a periodic balance check that can down-migrate tasks if the faster
domain is oversubscribed and the slower is under-utilized.
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
Evaluation patch to investigate using load as a representation of the
amount of POTENTIAL cpu compute capacity used, rather than a
representation of the CURRENT cpu compute capacity.
If CPUFreq is enabled, scales load in accordance with frequency.
Powersave/performance CPUFreq governors are detected and scaling is
disabled while these governors are in use. This is because when a
single-frequency governor is in use, potential CPU capacity is static.
So long as the governors and CPUFreq subsystem correctly report the
frequencies available, the scaling should self-tune.
Adds an additional file to sysfs to allow this feature to be disabled
for experimentation.
/sys/kernel/hmp/frequency_invariant_load_scale
write 0 to disable, 1 to enable.
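A sketch of the scaling step (the fixed-point field layout is an
assumption; the function name matches the one referenced elsewhere in
this log):

/*
 * Scale a time delta by curr_freq/max_freq so that tracked load
 * reflects use of potential, not current, capacity.
 */
static inline u64 hmp_variable_scale_convert(u64 delta)
{
        /* hmp_data.freq_scale: curr/max ratio in <<10 fixed point
         * (hypothetical field layout) */
        return (delta * hmp_data.freq_scale) >> 10;
}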
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|
|
These functions allow changing the load average period used
in the task load average computation through
/sys/kernel/hmp/load_avg_period_ms. This period is the time
in ms to go from 0 to 0.5 load average while running, or the
time from 1 to 0.5 while sleeping.
The default is 32 and gives the same load_avg_ratio
computation as without this patch. These functions also allow
changing the up and down thresholds of HMP through
/sys/kernel/hmp/{up,down}_threshold. Both must be between 0 and
1024. The thresholds are divided by 1024 before being compared
to the load_avg_ratio.
For example, if /sys/kernel/hmp/load_avg_period_ms is 128 and
/sys/kernel/hmp/up_threshold is 512, a task will be migrated
to a bigger cluster after running for 128ms, because after
load_avg_period_ms the load average is 0.5 and the real
up_threshold is 512 / 1024 = 0.5.
Signed-off-by: Olivier Cozette <olivier.cozette@arm.com>
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
[k.kozlowski: rebased on 4.1, no signed-off-by of previous committer]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
|